From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id BD728A0AC80; Tue, 9 Apr 2024 14:07:29 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org BD728A0AC80 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1712660849; bh=w1dDQsMMOc764evg2OdJAdiDprRsjg2SvgxKN2Yj8kA=; h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=qyJGRiUX4f6hONVS+Nl3VgyEIQFk4flEuagdkCJZ14p77FadI6/8Eu8M67/JS9aEA +ylLRQk5X88CNTPUTA3h+4Fsg0QpUGnKvVApGDLwr4yHX3VPxC1Wk1HLyWbaDi7bZS ZjDUro/ywfjYv1iZuTUKQPfum2H2vlmh60fx4vdQ= Received: from smtp39.i.mail.ru (smtp39.i.mail.ru [95.163.41.80]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id DEB59A0AC80 for ; Tue, 9 Apr 2024 14:07:28 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org DEB59A0AC80 Received: by smtp39.i.mail.ru with esmtpa (envelope-from ) id 1ru9K3-0000000DC0h-11fb; Tue, 09 Apr 2024 14:07:28 +0300 Content-Type: multipart/alternative; boundary="------------SUPZZHcVC6xhCPA290klyD7m" Message-ID: Date: Tue, 9 Apr 2024 14:07:26 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird To: Sergey Kaplun References: <9a71bf0765acd6ab019de4ae9f491a6c7bcb463d.1710416150.git.sergeyb@tarantool.org> Content-Language: en-US In-Reply-To: X-Mailru-Src: smtp X-4EC0790: 10 X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD9D327C87852EB66D3342A6001465C01AB48AF3E861E45F6F6182A05F538085040E4642D1DDC0D7A84AC8EDD30083ED68E82243196EBD2B223CB82B7C8BF33CEE9BB3D1907BEA9FCEE X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE70ED3881ADD6CEF6AEA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637EA3E6C6690B39D9B8638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D86482BD16B07FDC06B2DE1AE734533CF7CF2488994885B041CC7F00164DA146DAFE8445B8C89999728AA50765F7900637F6B57BC7E64490618DEB871D839B7333395957E7521B51C2DFABB839C843B9C08941B15DA834481F8AA50765F7900637D0FEED2715E18529389733CBF5DBD5E9B5C8C57E37DE458BD9DD9810294C998ED8FC6C240DEA76428AA50765F7900637486022F0596659D4D81D268191BDAD3DBD4B6F7A4D31EC0BE2F48590F00D11D6D81D268191BDAD3D78DA827A17800CE784DEC4DD72B68C57EC76A7562686271ED91E3A1F190DE8FD2E808ACE2090B5E14AD6D5ED66289B5278DA827A17800CE7390FB772827C5A6B2EB15956EA79C166A417C69337E82CC275ECD9A6C639B01B78DA827A17800CE7EBD83694CB3B8CDE731C566533BA786AA5CC5B56E945C8DA X-C1DE0DAB: 0D63561A33F958A547125996E6119D105002B1117B3ED696D931FA4C67381F5F361FAC1196A180DE823CB91A9FED034534781492E4B8EEAD3BF9A24AEF94352CBDAD6C7F3747799A X-C8649E89: 1C3962B70DF3F0ADE00A9FD3E00BEEDF3FED46C3ACD6F73ED3581295AF09D3DF87807E0823442EA2ED31085941D9CD0AF7F820E7B07EA4CF788B46B559598C0BC0F813EE7956298CF81A431713A9CF48F493DC7FD9324D7F8853F4E7B596232843A9A4803F78FBB95C7A4117A6CB87C695BBED43AF83D5F14502C5F85AEFE746C226CC413062362A913E6812662D5F2AB9AF64DB4688768036DF5FE9C0001AF333F2C28C22F508233FCF178C6DD14203 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojRxVvDy3pXeruwFrp6lA6Yg== X-Mailru-Sender: 520A125C2F17F0B1A9638AD358559B594FC308E575ECDACFAC8EDD30083ED68E82243196EBD2B223B7CBEF92542CD7C8795FA72BAB74744FC77752E0C033A69EA16A481184E8BB1C9B38E6EA4F046BE03A5DB60FBEB33A8A0DA7A0AF5A3A8387 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit 1/2] OSX/iOS/ARM64: Fix generation of Mach-O object files. X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Bronnikov via Tarantool-patches Reply-To: Sergey Bronnikov Cc: Sergey Bronnikov , tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" This is a multi-part message in MIME format. --------------SUPZZHcVC6xhCPA290klyD7m Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Hi, Sergey thanks for review! Please see my comment below. On 08.04.2024 18:01, Sergey Kaplun wrote: > Hi, Sergey! > Thanks for the patch! > I'll proceed with the version from the branch. > > Please consider my comments below. > > Please rebase your branch to the tarantool/master and solve the > conflicts. > > On 26.03.24, Sergey Bronnikov wrote: >> Subject: [PATCH 2/3] OSX/iOS/ARM64: Fix generation of Mach-O object files. >> >> Thanks to Carlo Cabrera. >> >> (cherry picked from commit 3065c910ad6027031aabe2dfd3c26a3d0f014b4f) >> > > >> Part of tarantool/tarantool#9595 >> --- >> .github/workflows/exotic-builds-testing.yml | 6 +- >> src/jit/bcsave.lua | 6 +- >> test/LuaJIT-tests/CMakeLists.txt | 9 + >> test/LuaJIT-tests/lib/ffi/index | 2 +- >> ...generation_of_mach-o_object_files.test.lua | 343 ++++++++++++++++++ >> 5 files changed, 361 insertions(+), 5 deletions(-) >> create mode 100644 test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua >> >> diff --git a/.github/workflows/exotic-builds-testing.yml b/.github/workflows/exotic-builds-testing.yml >> index 859603bd..9cace0bc 100644 >> --- a/.github/workflows/exotic-builds-testing.yml >> +++ b/.github/workflows/exotic-builds-testing.yml >> @@ -34,7 +34,7 @@ jobs: >> BUILDTYPE: [Debug, Release] >> ARCH: [ARM64, x86_64] >> GC64: [ON, OFF] >> - FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind] >> + FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind, avx512] > Please sort entries alphabetically. Done. --- a/.github/workflows/exotic-builds-testing.yml +++ b/.github/workflows/exotic-builds-testing.yml @@ -34,7 +34,7 @@ jobs:          BUILDTYPE: [Debug, Release]          ARCH: [ARM64, x86_64]          GC64: [ON, OFF] -        FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind, avx512] +        FLAVOR: [ avx512, checkhook, dualnum, gdbjit, nojit, nounwind ]          include:            - BUILDTYPE: Debug              CMAKEFLAGS: -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON >> include: >> - BUILDTYPE: Debug >> CMAKEFLAGS: -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON >> @@ -50,12 +50,16 @@ jobs: >> FLAVORFLAGS: -DLUAJIT_USE_GDBJIT=ON >> - FLAVOR: nounwind >> FLAVORFLAGS: -DLUAJIT_NO_UNWIND=ON >> + - FLAVOR: avx512 >> + CMAKEFLAGS: -DCMAKE_C_FLAGS=-march=skylake-avx512 -DCMAKE_C_COMPILER=gcc > Is there any guarantee that the given runner has avx512 support? > If not, our tests will fail due to illegal instruction error. I'll add a check for AVX support and condition that will skip job if there is no AVX support. >> exclude: >> - ARCH: ARM64 >> GC64: OFF >> # DUALNUM is default for ARM64, no need for additional testing. >> - FLAVOR: dualnum >> ARCH: ARM64 >> + - FLAVOR: avx512 >> + ARCH: ARM64 > Please sort entries alphabetically. Fixed: --- a/.github/workflows/exotic-builds-testing.yml +++ b/.github/workflows/exotic-builds-testing.yml @@ -34,12 +34,14 @@ jobs:          BUILDTYPE: [Debug, Release]          ARCH: [ARM64, x86_64]          GC64: [ON, OFF] -        FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind, avx512] +        FLAVOR: [ avx512, checkhook, dualnum, gdbjit, nojit, nounwind ]          include:            - BUILDTYPE: Debug              CMAKEFLAGS: -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON            - BUILDTYPE: Release              CMAKEFLAGS: -DCMAKE_BUILD_TYPE=RelWithDebInfo +          - FLAVOR: avx512 +            CMAKEFLAGS: -DCMAKE_C_FLAGS=-march=skylake-avx512 -DCMAKE_C_COMPILER=gcc            - FLAVOR: dualnum              FLAVORFLAGS: -DLUAJIT_NUMMODE=2            - FLAVOR: checkhook >> runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}'] >> name: > >> LuaJIT ${{ matrix.FLAVOR }} >> diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua >> index a287d675..7aec1555 100644 >> --- a/src/jit/bcsave.lua >> +++ b/src/jit/bcsave.lua >> @@ -446,18 +446,18 @@ typedef struct { >> uint32_t value; >> } mach_nlist; >> typedef struct { >> - uint32_t strx; >> + int32_t strx; >> uint8_t type, sect; >> uint16_t desc; >> uint64_t value; >> } mach_nlist_64; >> typedef struct >> { >> - uint32_t magic, nfat_arch; >> + int32_t magic, nfat_arch; >> } mach_fat_header; >> typedef struct >> { >> - uint32_t cputype, cpusubtype, offset, size, align; >> + int32_t cputype, cpusubtype, offset, size, align; >> } mach_fat_arch; >> typedef struct { >> struct { >> diff --git a/test/LuaJIT-tests/CMakeLists.txt b/test/LuaJIT-tests/CMakeLists.txt >> index a0fb5440..29146113 100644 >> --- a/test/LuaJIT-tests/CMakeLists.txt >> +++ b/test/LuaJIT-tests/CMakeLists.txt >> @@ -64,6 +64,15 @@ if(LUAJIT_NO_UNWIND) >> set(LUAJIT_TEST_TAGS_EXTRA +internal_unwinder) >> endif() >> >> +if(CMAKE_C_FLAGS MATCHES "-march=skylake-avx512") >> + # Test verifies bitwise operations on numbers. > Maybe we should add FIXME: tag here. Fill free to ignore. Added. >> + # There is a known issue - bitop doesn't working in LuaJIT > Typo: s/working/work/ Fixed. > >> + # built with enabled AVX512 instruction set, > Typo: s/enabled/the enabled/ Fixed. > >> + # seehttps://github.com/tarantool/tarantool/issues/6787. >> + # Hence, skip this when "skylake-avx512" is passed. >> + set(LUAJIT_TEST_TAGS_EXTRA +avx512) >> +endif() >> + >> add_custom_command(TARGET LuaJIT-tests >> COMMENT "Running LuaJIT-tests" >> COMMAND >> diff --git a/test/LuaJIT-tests/lib/ffi/index b/test/LuaJIT-tests/lib/ffi/index >> index e3a34e30..e7c168c2 100644 >> --- a/test/LuaJIT-tests/lib/ffi/index >> +++ b/test/LuaJIT-tests/lib/ffi/index >> @@ -1,4 +1,4 @@ >> -bit64.lua +luajit>=2.1 >> +bit64.lua +luajit>=2.1 -avx512 > Do we need to skip all tests instead of only failed one? > >> cdata_var.lua >> copy_fill.lua >> err.lua >> diff --git a/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua > Please, use "-" instead of "_" as a separator, according to other test > names. > Also, the test name is too long. I suggest something like the > following: > > | lj-865-cross-generation-mach-o-file.test.lua > > Don't forget to update the testname below as well. Fixed. >> new file mode 100644 >> index 00000000..40bd1c74 >> --- /dev/null >> +++ b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua >> @@ -0,0 +1,343 @@ >> +local tap = require('tap') >> +local test = tap.test('lj-865-fix_generation_of_mach-o_object_files') >> + >> +test:plan(4) >> + >> +-- The test creates an object file in Mach-O format with LuaJIT >> +-- bytecode and checks the validness of the object file fields. > Typo? s/validness/validity/ > IINM, validness is the obsolete form. Fixed. >> +-- >> +-- The original problem is reproduced with LuaJIT that built with > Typo: s/LuaJIT that built/LuaJIT, which is built/ > >> +-- enabled AVX512F instructions. The support for AVX512F could be >> +-- checked in `/proc/cpuinfo` on Linux and >> +-- `sysctl hw.optional.avx512f` on Mac. AVX512F must be >> +-- implicitly enabled in a C compiler by passing CPU codename. > Typo: s/CPU codename/a CPU codename/ Fixed >> +-- Please consult for the available model architecture on >> +-- GCC Online Documentation [1] for available CPU codenames. >> +-- and Wikipedia for CPUs with AVX-512 support [2]. > I suggest the following rewording: > > | Please consult the GCC Online Documentation [1] for available CPU > | codenames. Also, see the Wikipedia for CPUs with AVX-512 support [2]. > > Or the follwing: > > | Please take a look at the GCC Online Documentation [1] for available > | CPU codenames. Also, see the Wikipedia for CPUs with AVX-512 support [2]. > > Feel free to reword it like you want. What's wrong with provided text? Why should I change it? >> +-- To detect CPU codename execute: > Typo: s/CPU codename/the CPU codename/ Fixed. > >> +-- `gcc -march=native -Q --help=target | grep march`. >> +-- >> +-- 1.https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html >> +-- 2.https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512 >> +-- >> +-- Manual steps for reproducing are the following: > I suppose you mean "from the original issue". Or we can use > | -DCMAKE_C_FLAGS="-march=skylake-avx512" These steps works in our fork too. >> +-- >> +-- $ make CC=gcc TARGET_CFLAGS='skylake-avx512' -f Makefile.original >> +-- $ echo > test.lua >> +-- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o >> +-- $ file test.o >> +-- empty.o: DOS executable (block device driver) >> + >> +local ffi = require('ffi') >> + >> +-- LuaJIT can generate so called Universal Binary with Lua bytetcode. > Comment width is more than 66 symbols. Fixed. > >> +-- The Universal Binary format is a format for executable files >> +-- that run natively on hardware platforms with different hardware >> +-- architectures. This concept is more generally known as a fat binary. >> +-- >> +-- Format of the Mach-O is described in the document > Typo: s/Format/The format/ Fixed. > >> +-- "OS X ABI Mach-O File Format Reference", published by Apple company. > Typo? s/Reference",/Reference,"/ Don't get it. Why comma should be inside quotes? > >> +-- Copy of the (now removed) official documentation can be found > Typo: s/Copy/The copy/ Fixed. > >> +-- here [1]. Yet another source of thruth are XNU headers, > Typo: s/thruth/truth/ Fixed. > > >> +-- see definition of C-structures in: [2] (`nlist_64`), > Typo: s/definition/the definition/ Fixed. > >> +-- [3] (`fat_arch` and `fat_header`). >> +-- >> +-- There are a good visual representation of Universal Binary > Typo: s/are/is/ Fixed. > >> +-- in "Mac OS X Internals" book (pages 67-68) [5] and in [6]. > Typo: s/in/in the/g Fixed. > >> +-- Below is schematic structure of Universal Binary that > Typo: s/schematic/the schematic/ > Typo: s/Binary that/Binary, which/ Fixed. > >> +-- includes two executables for PowerPC and Intel i386 (omitted): >> +-- >> +-- 0x0000000 --------------------------------------- >> +-- | >> +-- struct | 0xcafebabe FAT_MAGIC magic >> +-- fat_header | ------------------------------------- >> +-- | 0x00000003 nfat_arch >> +-- --------------------------------------- >> +-- | 0x00000012 CPU_TYPE_POWERPC cputype >> +-- | ------------------------------------- >> +-- | 0x00000000 CPU_SUBTYPE_POWERPC_ALL cpusubtype >> +-- struct | ------------------------------------- >> +-- fat_arch | 0x00001000 4096 bytes offset >> +-- | ------------------------------------- >> +-- | 0x00004224 16932 bytes size >> +-- | ------------------------------------- >> +-- | 0x0000000c 2^12 = 4096 bytes align >> +-- --------------------------------------- >> +-- --------------------------------------- >> +-- | 0x00000007 CPU_TYPE_I386 cputype >> +-- | ------------------------------------- >> +-- | 0x00000003 CPU_SUBTYPE_I386_ALL cpusubtype >> +-- struct | ------------------------------------- >> +-- fat_arch | 0x00006000 24576 bytes offset >> +-- | ------------------------------------- >> +-- | 0x0000292c 10540 bytes size >> +-- | ------------------------------------- >> +-- | 0x0000000c 2^12 = 4096 bytes align >> +-- --------------------------------------- >> +-- Unused >> +-- 0x00001000 --------------------------------------- >> +-- | 0xfeedface MH_MAGIC magic >> +-- | ------------------------------------ >> +-- | 0x00000012 CPU_TYPE_POWERPC cputype >> +-- | ------------------------------------ >> +-- struct | 0x00000000 CPU_SUBTYPE_POWERPC_ALL cpusubtype >> +-- mach_header | ------------------------------------ >> +-- | 0x00000002 MH_EXECUTE filetype >> +-- | ------------------------------------ >> +-- | 0x0000000b 10 load commands ncmds >> +-- | ------------------------------------ >> +-- | 0x00000574 1396 bytes sizeofcmds >> +-- | ------------------------------------ >> +-- | 0x00000085 DYLDLINK TWOLEVEL flags >> +-- -------------------------------------- >> +-- Load commands >> +-- --------------------------------------- >> +-- Data >> +-- --------------------------------------- > Side note: This diagramm is amazing thanks! > >> +-- >> +-- < x86 executable > >> +-- >> +-- > Nit: excess empty line. Removed. >> +-- 1.https://github.com/aidansteele/osx-abi-macho-file-format-reference >> +-- 2.https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h >> +-- 3.https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h >> +-- 4.https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code >> +-- 5.https://reverseengineering.stackexchange.com/a/6357/46029 >> +-- 6.http://formats.kaitai.io/mach_o/index.html >> +-- >> +-- Using the same declarations as defined in . >> +ffi.cdef[[ >> +typedef struct >> +{ >> + uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags; >> +} mach_header; >> + >> +typedef struct >> +{ >> + mach_header; uint32_t reserved; >> +} mach_header_64; >> + >> +typedef struct { >> + uint32_t cmd, cmdsize; >> + char segname[16]; >> + uint32_t vmaddr, vmsize, fileoff, filesize; >> + uint32_t maxprot, initprot, nsects, flags; >> +} mach_segment_command; >> + >> +typedef struct { >> + uint32_t cmd, cmdsize; >> + char segname[16]; >> + uint64_t vmaddr, vmsize, fileoff, filesize; >> + uint32_t maxprot, initprot, nsects, flags; >> +} mach_segment_command_64; >> + >> +typedef struct { >> + char sectname[16], segname[16]; >> + uint32_t addr, size; >> + uint32_t offset, align, reloff, nreloc, flags; >> + uint32_t reserved1, reserved2; >> +} mach_section; >> + >> +typedef struct { >> + char sectname[16], segname[16]; >> + uint64_t addr, size; >> + uint32_t offset, align, reloff, nreloc, flags; >> + uint32_t reserved1, reserved2, reserved3; >> +} mach_section_64; >> + >> +typedef struct { >> + uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize; >> +} mach_symtab_command; >> + >> +typedef struct { >> + int32_t strx; >> + uint8_t type, sect; >> + int16_t desc; >> + uint32_t value; >> +} mach_nlist; >> + >> +typedef struct { >> + uint32_t strx; > Should we use here `int32_t` as it is after the changes? Yes, fixed. > >> + uint8_t type, sect; >> + uint16_t desc; >> + uint64_t value; >> +} mach_nlist_64; >> + >> +typedef struct >> +{ >> + uint32_t magic, nfat_arch; > Should we use here `int32_t` as it is after the changes? Fixed. > >> +} mach_fat_header; >> + >> +typedef struct >> +{ >> + uint32_t cputype, cpusubtype, offset, size, align; > Should we use here `int32_t` as it is after the changes? Fixed. > >> +} mach_fat_arch; >> + >> +typedef struct { >> + mach_fat_header fat; >> + mach_fat_arch fat_arch[2]; >> + struct { >> + mach_header hdr; >> + mach_segment_command seg; >> + mach_section sec; >> + mach_symtab_command sym; >> + } arch[2]; >> + mach_nlist sym_entry; >> + uint8_t space[4096]; >> +} mach_fat_obj; >> + >> +typedef struct { >> + mach_fat_header fat; >> + mach_fat_arch fat_arch[2]; >> + struct { >> + mach_header_64 hdr; >> + mach_segment_command_64 seg; >> + mach_section_64 sec; >> + mach_symtab_command sym; >> + } arch[2]; >> + mach_nlist_64 sym_entry; >> + uint8_t space[4096]; >> +} mach_fat_obj_64; > Should this part be defined in the next patch, since there is no such > cdef at the moment? > > OTOH, it makes test more arch-agnostic. So, the adding of the other case > is more simple. Moved to the third patch. >> +]] >> + >> +local function create_obj_file(name, arch) >> + local mach_o_path = os.tmpname() .. '.o' >> + local lua_path = os.getenv('LUA_PATH') >> + local lua_bin = require('utils').exec.luacmd(arg):match('%S+') >> + local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -o osx -a %s -e "print()" %s' >> + local cmd = (cmd_fmt):format(lua_path, lua_bin, name, arch, mach_o_path) >> + local ret = os.execute(cmd) >> + assert(ret == 0, 'cannot create an object file') >> + return mach_o_path >> +end >> + >> +-- Parses a buffer in the Mach-O format and returns >> +-- the FAT magic number and `nfat_arch`. >> +local function read_mach_o(buf, hw_arch) >> + local res = { >> + header = { >> + magic = 0, >> + nfat_arch = 0, >> + }, >> + fat_arch = {}, >> + } >> + >> + local is64 = hw_arch == 'arm64' >> + >> + -- Mach-O FAT object. >> + local mach_fat_obj_type = ffi.typeof(is64 and >> + 'mach_fat_obj_64 *' or 'mach_fat_obj *') >> + local obj = ffi.cast(mach_fat_obj_type, buf) >> + >> + -- Mach-O FAT object header. >> + local mach_fat_header_type = ffi.typeof('mach_fat_header *') >> + local mach_fat_header = ffi.cast(mach_fat_header_type, obj.fat) > Looks like this cast is excess: > | src/luajit -e ' > | local ffi = require"ffi" > | ffi.cdef[[ > | struct test {int a; int b;}; > | struct test_outer { struct test t; int c;}; > | ]] > | local s = ffi.new("struct test_outer", {{0}, 0}) > | print(s.t.a) > | ' > | 0 Code in the test casts cdata to ffi type, you snippet demonstrate how to create cdata using ffi.new. It is similar cases? > >> + local be32 = bit.bswap -- Mach-O FAT is BE, target arch is LE. > Please, use the separate line for the comment. Moved to a new line. > >> + res.header.magic = be32(mach_fat_header.magic) >> + res.header.nfat_arch = be32(mach_fat_header.nfat_arch) >> + >> + -- Mach-O FAT object arches. >> + local mach_fat_arch_type = ffi.typeof('mach_fat_arch *') >> + for i = 0, res.header.nfat_arch - 1 do >> + local fat_arch = ffi.cast(mach_fat_arch_type, obj.fat_arch[i]) > Looks like this cast is excess. > >> + local arch = { >> + cputype = be32(fat_arch.cputype), >> + cpusubtype = be32(fat_arch.cpusubtype), >> + } >> + table.insert(res.fat_arch, arch) >> + end >> + >> + return res >> +end >> + >> +-- Universal Binary can contain executables for more than one >> +-- CPU architecture. For simplicity the test compares > Typo: s/For simplicity/For simplicity,/ Fixed. > Nit: The comment line is "underfilled". Fixed. > >> +-- sum of CPU types and CPU subtypes. > Typo: s/sum/the sum/ Fixed. > >> +-- Numbers below are defined in . > Typo: s/Numbers/The numbers/ There is no space for "" on the line after adding "The"as you requested. And line become underfileld if "" moved to the next line. How do it better: leave a free space on the line or left "Number" without article? > >> +-- Original source is XNU source code, [1]. > Typo: s/Original/The original/ > Typo: s/XNU/the XNU/ the same as above, with added articles line looks underfilled and therefore it looks ugly. > >> +-- >> +-- 1.https://opensource.apple.com/source/xnu/xnu-4570.41.2/osfmk/mach/machine.h.auto.html >> +local sum_cputype = { >> + x86 = 7, >> + x64 = 0x01000007, >> + arm = 7 + 12, >> + arm64 = 0x01000007 + 0x0100000c, >> +} >> +local sum_cpusubtype = { >> + x86 = 3, >> + x64 = 3, >> + arm = 3 + 9, >> + arm64 = 3 + 0, >> +} > I suggest the following for the clarification: > > | local cputype = { > | x86 = 7, > | x64 = 0x01000007, > | arm = 7, > | arm64 = 0x0100000c, > | } > | local cpusubtype = { > | x86 = 3, > | x64 = 3, > | arm = 9, > | arm64 = 0, > | } > > And declare next: > > | local sum_cputype = { > | arm = cputype.arm + cputype.x86, > | arm64 = cputype.arm64 + cputype.x64, > | } > | local sum_cpusubtype = { > | arm = cpusubtype.arm + cpusubtype.x86, > | arm64 = cpusubtype.arm64 + cpusubtype.x64, > | } You proposed changes breaks the test. See original dictionaries https://github.com/tarantool/luajit/blob/ecc45358ddab52c2d0100c74da8efddef2aaf365/src/jit/bcsave.lua#L512-L513   local cputype = ({ x86={7}, x64={0x01000007}, arm={7,12}, arm64={0x01000007,0x0100000c} })[ctx.arch] converted to sum_cputype = { x86 = 7, x64 = 0x01000007, arm = 7 + 12, arm64 = 0x01000007 + 0x0100000c}   local cpusubtype = ({ x86={3}, x64={3}, arm={3,9}, arm64={3,0} })[ctx.arch] converted to sum_cpusubtype = { x86 = 3, x64 = 3, arm = 3 + 9, arm64 = 3 + 0 } I don't know how to make this place in the test more clear. See comment that explains these arrays:   -- Universal Binary can contain executables for more than one   -- CPU architecture. For simplicity, the test compares the sum of   -- CPU types and CPU subtypes.   -- Numbers below are defined in .   -- The original source is the XNU source code,   -- [1].   -- Ignored. > > Also, it is preferable to use upper case since values are constants. Made everything uppercase. > >> + >> +-- The function builds Mach-O FAT object file and retrieves >> +-- its header fields (magic and nfat_arch) >> +-- and fields of the each arch (cputype, cpusubtype). > Typo: s/the each/each/ Fixed. Also line above was underfilled, fixed that. > >> +-- >> +-- Mach-O FAT object header can be retrieved with `otool` on macOS: > Typo: s/Mach-O FAT object/A Mach-O FAT object/ Fixed to "The Mach-O FAT object". > Typo: s/header/headers/ single header, why "s"? >> +-- >> +-- $ otool -f empty.o >> +-- Fat headers >> +-- fat_magic 0xcafebabe >> +-- nfat_arch 2 >> +-- > Minor: it is better to separate quoting in the comment like the > following to simplify reading: > > | $ otool -f empty.o > | Fat headers > | fat_magic 0xcafebabe > | nfat_arch 2 > | ... > > Feel free to ignore. Ignored. > >> +-- >> +-- CPU type and subtype can be retrieved with `lipo` on macOS: >> +-- >> +-- $ luajit -b -o osx -a arm empty.lua empty.o >> +-- $ lipo -archs empty.o >> +-- i386 armv7 >> +-- $ luajit -b -o osx -a arm64 empty.lua empty.o >> +-- $ lipo -archs empty.o >> +-- x86_64 arm64 >> +local function build_and_check_mach_o(hw_arch) > Maybe it is better to use a subtest here: > The number of tests is fixed for this function, and it becomes more > clear when we add the additional arch for testing. > >> + assert(hw_arch == 'arm' or hw_arch == 'arm64') >> + >> + -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in >> + -- big-endian byte order format. On a big-endian host CPU, >> + -- this can be validated using the constant FAT_MAGIC; >> + -- on a little-endian host CPU, it can be validated using >> + -- the constant FAT_CIGAM. >> + -- >> + -- FAT_NARCH is an integer specifying the number of fat_arch >> + -- data structures that follow. This is the number of >> + -- architectures contained in this binary. >> + -- >> + -- See aforementioned "OS X ABI Mach-O File Format Reference". > Typo: s/aforementioned/the aforementioned/ With added article I need to move name of the reference to a next line, and line looks under-filled and ugly. >> + -- > Excess empty line. It is a visual splitter. > >> + local FAT_MAGIC = '0xffffffffcafebabe' >> + local FAT_NARCH = 2 >> + >> + local MODULE_NAME = 'lango_team' >> + >> + local mach_o_obj_path = create_obj_file(MODULE_NAME, hw_arch) >> + local mach_o_buf = require('utils').tools.read_file(mach_o_obj_path) > Please move `require` of the necessary tool to the start of the file to > be consistent with the code style in tests. moved. I found that 20 tests in our repository are not consistent too: test/tarantool-tests/lj-946-print-errors-from-gc-fin-default.test.lua:local script = require('utils').exec.makecmd(arg, { redirect = '2>&1' }) test/tarantool-tests/unit-jit-parse.test.lua:local jparse = require('utils').jit.parse test/tarantool-tests/lj-981-folding-0.test.lua:local jparse = require('utils').jit.parse test/tarantool-tests/lj-366-strtab-correct-size.test.lua:  local lua_bin = require('utils').exec.luacmd(arg):match('%S+') test/tarantool-tests/lj-366-strtab-correct-size.test.lua:local elf_content = require('utils').tools.read_file(elf_filename) test/tarantool-tests/fix-argv2ctype-cts-L-init.test.lua:local script = require('utils').exec.makecmd(arg) grep: test/tarantool-tests/.lj-736-BC_UCLO-triggers-infinite-loop.lua.swp: binary file matches test/tarantool-tests/fix-mips64-spare-side-exit-patching.test.lua:local generators = require('utils').jit.generators test/tarantool-tests/fix-mips64-spare-side-exit-patching.test.lua:local frontend = require('utils').frontend test/tarantool-tests/lj-1128-table-new-opt-tnew.test.lua:local jparse = require('utils').jit.parse test/tarantool-tests/gh-6098-fix-side-exit-patching-on-arm64.test.lua:local generators = require('utils').jit.generators test/tarantool-tests/gh-6098-fix-side-exit-patching-on-arm64.test.lua:local frontend = require('utils').frontend test/tarantool-tests/lj-351-print-tostring-number.test.lua:local script = require('utils').exec.makecmd(arg) test/tarantool-tests/lj-flush-on-trace.test.lua:local script = require('utils').exec.makecmd(arg, { test/tarantool-tests/lj-366-strtab-correct-size.test.lua_:  local lua_bin = require('utils').exec.luacmd(arg):match('%S+') test/tarantool-tests/lj-366-strtab-correct-size.test.lua_:local elf_content = require('utils').tools.read_file(elf_filename) test/tarantool-tests/fix-gc-setupvalue.test.lua:local gcisblack = require('utils').gc.isblack test/tarantool-tests/misclib-getmetrics-lapi.test.lua:local MAXNINS = require('utils').jit.const.maxnins test/tarantool-tests/gh-4427-ffi-sandwich.test.lua:local script = require('utils').exec.makecmd(arg, { test/tarantool-tests/fix-jit-dump-ir-conv.test.lua:local jparse = require('utils').jit.parse test/tarantool-tests/lj-802-panic-at-mcode-protfail.test.lua:local script = require('utils').exec.makecmd(arg, { test/tarantool-tests/lj-962-stack-overflow-report.test.lua:local LUABIN = require('utils').exec.luabin(arg) > >> + assert(mach_o_buf == nil or #mach_o_buf ~= 0, 'cannot read an object file') > Should it be: > | mach_o_buf ~= nil and #mach_o_buf ~= 0 Fixed to " assert(mach_o_buf ~= nil and #mach_o_buf ~= 0, 'cannot read an object file')". > >> + >> + local mach_o = read_mach_o(mach_o_buf, hw_arch) >> + >> + -- Teardown. >> + local retcode = os.remove(mach_o_obj_path) >> + assert(retcode == true, 'remove an object file') > Minor: I suggest refactoring this as the following (`retcode` isn't used > anywhere else): > > | assert(os.remove(mach_o_obj_path), 'remove an object file') Updated. > >> + >> + local magic_str = string.format('0x%02x', mach_o.header.magic) > Why do we need 02 in the format string? > Also, you may use '%#x' for the same result. Fixed. > >> + test:is(magic_str, FAT_MAGIC, >> + 'fat_magic is correct in Mach-O, ' .. hw_arch) >> + test:is(mach_o.header.nfat_arch, FAT_NARCH, >> + 'nfat_arch is correct in Mach-O, ' .. hw_arch) >> + >> + local total_cputype = 0 >> + local total_cpusubtype = 0 >> + for i = 1, mach_o.header.nfat_arch do > Minor: I suggest using `FAT_NARCH` here since we've checked their > equality above. Updated. > >> + total_cputype = total_cputype + mach_o.fat_arch[i].cputype >> + total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype >> + end >> + test:is(total_cputype, sum_cputype[hw_arch], >> + 'cputype is correct in Mach-O, ' .. hw_arch) >> + test:is(total_cpusubtype, sum_cpusubtype[hw_arch], >> + 'cpusubtype is correct in Mach-O, ' .. hw_arch) >> +end >> + >> +build_and_check_mach_o('arm') >> + >> +test:done(true) >> -- >> 2.44.0 --------------SUPZZHcVC6xhCPA290klyD7m Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit

Hi, Sergey

thanks for review! Please see my comment below.


On 08.04.2024 18:01, Sergey Kaplun wrote:
Hi, Sergey!
Thanks for the patch!
I'll proceed with the version from the branch.

Please consider my comments below.

Please rebase your branch to the tarantool/master and solve the
conflicts.

On 26.03.24, Sergey Bronnikov wrote:
Subject: [PATCH 2/3] OSX/iOS/ARM64: Fix generation of Mach-O object files.

Thanks to Carlo Cabrera.

(cherry picked from commit 3065c910ad6027031aabe2dfd3c26a3d0f014b4f)

<snipped>

Part of tarantool/tarantool#9595
---
 .github/workflows/exotic-builds-testing.yml   |   6 +-
 src/jit/bcsave.lua                            |   6 +-
 test/LuaJIT-tests/CMakeLists.txt              |   9 +
 test/LuaJIT-tests/lib/ffi/index               |   2 +-
 ...generation_of_mach-o_object_files.test.lua | 343 ++++++++++++++++++
 5 files changed, 361 insertions(+), 5 deletions(-)
 create mode 100644 test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua

diff --git a/.github/workflows/exotic-builds-testing.yml b/.github/workflows/exotic-builds-testing.yml
index 859603bd..9cace0bc 100644
--- a/.github/workflows/exotic-builds-testing.yml
+++ b/.github/workflows/exotic-builds-testing.yml
@@ -34,7 +34,7 @@ jobs:
         BUILDTYPE: [Debug, Release]
         ARCH: [ARM64, x86_64]
         GC64: [ON, OFF]
-        FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind]
+        FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind, avx512]
Please sort entries alphabetically.

Done.


--- a/.github/workflows/exotic-builds-testing.yml
+++ b/.github/workflows/exotic-builds-testing.yml
@@ -34,7 +34,7 @@ jobs:
         BUILDTYPE: [Debug, Release]
         ARCH: [ARM64, x86_64]
         GC64: [ON, OFF]
-        FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind, avx512]
+        FLAVOR: [ avx512, checkhook, dualnum, gdbjit, nojit, nounwind ]
         include:
           - BUILDTYPE: Debug
             CMAKEFLAGS: -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON



      
         include:
           - BUILDTYPE: Debug
             CMAKEFLAGS: -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON
@@ -50,12 +50,16 @@ jobs:
             FLAVORFLAGS: -DLUAJIT_USE_GDBJIT=ON
           - FLAVOR: nounwind
             FLAVORFLAGS: -DLUAJIT_NO_UNWIND=ON
+          - FLAVOR: avx512
+            CMAKEFLAGS: -DCMAKE_C_FLAGS=-march=skylake-avx512 -DCMAKE_C_COMPILER=gcc
Is there any guarantee that the given runner has avx512 support?
If not, our tests will fail due to illegal instruction error.

I'll add a check for AVX support and condition that will skip job if there is no AVX support.


         exclude:
           - ARCH: ARM64
             GC64: OFF
           # DUALNUM is default for ARM64, no need for additional testing.
           - FLAVOR: dualnum
             ARCH: ARM64
+          - FLAVOR: avx512
+            ARCH: ARM64
Please sort entries alphabetically.

Fixed:

--- a/.github/workflows/exotic-builds-testing.yml
+++ b/.github/workflows/exotic-builds-testing.yml
@@ -34,12 +34,14 @@ jobs:
         BUILDTYPE: [Debug, Release]
         ARCH: [ARM64, x86_64]
         GC64: [ON, OFF]
-        FLAVOR: [checkhook, dualnum, gdbjit, nojit, nounwind, avx512]
+        FLAVOR: [ avx512, checkhook, dualnum, gdbjit, nojit, nounwind ]
         include:
           - BUILDTYPE: Debug
             CMAKEFLAGS: -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_ASSERT=ON -DLUA_USE_APICHECK=ON
           - BUILDTYPE: Release
             CMAKEFLAGS: -DCMAKE_BUILD_TYPE=RelWithDebInfo
+          - FLAVOR: avx512
+            CMAKEFLAGS: -DCMAKE_C_FLAGS=-march=skylake-avx512 -DCMAKE_C_COMPILER=gcc
           - FLAVOR: dualnum
             FLAVORFLAGS: -DLUAJIT_NUMMODE=2
           - FLAVOR: checkhook


    

      
     runs-on: [self-hosted, regular, Linux, '${{ matrix.ARCH }}']
     name: >
       LuaJIT ${{ matrix.FLAVOR }}
diff --git a/src/jit/bcsave.lua b/src/jit/bcsave.lua
index a287d675..7aec1555 100644
--- a/src/jit/bcsave.lua
+++ b/src/jit/bcsave.lua
@@ -446,18 +446,18 @@ typedef struct {
   uint32_t value;
 } mach_nlist;
 typedef struct {
-  uint32_t strx;
+  int32_t strx;
   uint8_t type, sect;
   uint16_t desc;
   uint64_t value;
 } mach_nlist_64;
 typedef struct
 {
-  uint32_t magic, nfat_arch;
+  int32_t magic, nfat_arch;
 } mach_fat_header;
 typedef struct
 {
-  uint32_t cputype, cpusubtype, offset, size, align;
+  int32_t cputype, cpusubtype, offset, size, align;
 } mach_fat_arch;
 typedef struct {
   struct {
diff --git a/test/LuaJIT-tests/CMakeLists.txt b/test/LuaJIT-tests/CMakeLists.txt
index a0fb5440..29146113 100644
--- a/test/LuaJIT-tests/CMakeLists.txt
+++ b/test/LuaJIT-tests/CMakeLists.txt
@@ -64,6 +64,15 @@ if(LUAJIT_NO_UNWIND)
   set(LUAJIT_TEST_TAGS_EXTRA +internal_unwinder)
 endif()
 
+if(CMAKE_C_FLAGS MATCHES "-march=skylake-avx512")
+  # Test <bit64.lua> verifies bitwise operations on numbers.
Maybe we should add FIXME: tag here. Fill free to ignore.

Added.



      
+  # There is a known issue - bitop doesn't working in LuaJIT
Typo: s/working/work/

Fixed.



+  # built with enabled AVX512 instruction set,
Typo: s/enabled/the enabled/

Fixed.



+  # see https://github.com/tarantool/tarantool/issues/6787.
+  # Hence, skip this when "skylake-avx512" is passed.
+  set(LUAJIT_TEST_TAGS_EXTRA +avx512)
+endif()
+
 add_custom_command(TARGET LuaJIT-tests
   COMMENT "Running LuaJIT-tests"
   COMMAND
diff --git a/test/LuaJIT-tests/lib/ffi/index b/test/LuaJIT-tests/lib/ffi/index
index e3a34e30..e7c168c2 100644
--- a/test/LuaJIT-tests/lib/ffi/index
+++ b/test/LuaJIT-tests/lib/ffi/index
@@ -1,4 +1,4 @@
-bit64.lua +luajit>=2.1
+bit64.lua +luajit>=2.1 -avx512
Do we need to skip all tests instead of only failed one?

 cdata_var.lua
 copy_fill.lua
 err.lua
diff --git a/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
Please, use "-" instead of "_" as a separator, according to other test
names.
Also, the test name is too long. I suggest something like the
following:

| lj-865-cross-generation-mach-o-file.test.lua

Don't forget to update the testname below as well.

Fixed.



      
new file mode 100644
index 00000000..40bd1c74
--- /dev/null
+++ b/test/tarantool-tests/lj-865-fix_generation_of_mach-o_object_files.test.lua
@@ -0,0 +1,343 @@
+local tap = require('tap')
+local test = tap.test('lj-865-fix_generation_of_mach-o_object_files')
+
+test:plan(4)
+
+-- The test creates an object file in Mach-O format with LuaJIT
+-- bytecode and checks the validness of the object file fields.
Typo? s/validness/validity/
IINM, validness is the obsolete form.

Fixed.



      
+--
+-- The original problem is reproduced with LuaJIT that built with
Typo: s/LuaJIT that built/LuaJIT, which is built/

+-- enabled AVX512F instructions. The support for AVX512F could be
+-- checked in `/proc/cpuinfo` on Linux and
+-- `sysctl hw.optional.avx512f` on Mac. AVX512F must be
+-- implicitly enabled in a C compiler by passing CPU codename.
Typo: s/CPU codename/a CPU codename/
Fixed

      
+-- Please consult for the available model architecture on
+-- GCC Online Documentation [1] for available CPU codenames.
+-- and Wikipedia for CPUs with AVX-512 support [2].
I suggest the following rewording:

| Please consult the GCC Online Documentation [1] for available CPU
| codenames. Also, see the Wikipedia for CPUs with AVX-512 support [2].

Or the follwing:

| Please take a look at the GCC Online Documentation [1] for available
| CPU codenames. Also, see the Wikipedia for CPUs with AVX-512 support [2].

Feel free to reword it like you want.

What's wrong with provided text?

Why should I change it?

+-- To detect CPU codename execute:
Typo: s/CPU codename/the CPU codename/

Fixed.



+-- `gcc -march=native -Q --help=target | grep march`.
+--
+-- 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
+-- 2. https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
+--
+-- Manual steps for reproducing are the following:
I suppose you mean "from the original issue". Or we can use
| -DCMAKE_C_FLAGS="-march=skylake-avx512"
These steps works in our fork too.

      
+--
+-- $ make CC=gcc TARGET_CFLAGS='skylake-avx512' -f Makefile.original
+-- $ echo > test.lua
+-- $ LUA_PATH="src/?.lua;;" luajit -b -o osx -a arm test.lua test.o
+-- $ file test.o
+-- empty.o: DOS executable (block device driver)
+
+local ffi = require('ffi')
+
+-- LuaJIT can generate so called Universal Binary with Lua bytetcode.
Comment width is more than 66 symbols.
Fixed.

+-- The Universal Binary format is a format for executable files
+-- that run natively on hardware platforms with different hardware
+-- architectures. This concept is more generally known as a fat binary.
+--
+-- Format of the Mach-O is described in the document
Typo: s/Format/The format/
Fixed.

+-- "OS X ABI Mach-O File Format Reference", published by Apple company.
Typo? s/Reference",/Reference,"/
Don't get it. Why comma should be inside quotes?

+-- Copy of the (now removed) official documentation can be found
Typo: s/Copy/The copy/
Fixed.

+-- here [1]. Yet another source of thruth are XNU headers,
Typo: s/thruth/truth/

Fixed.




+-- see definition of C-structures in: [2] (`nlist_64`),
Typo: s/definition/the definition/
Fixed.

+-- [3] (`fat_arch` and `fat_header`).
+--
+-- There are a good visual representation of Universal Binary
Typo: s/are/is/

Fixed.



+-- in "Mac OS X Internals" book (pages 67-68) [5] and in [6].
Typo: s/in/in the/g

Fixed.



+-- Below is schematic structure of Universal Binary that
Typo: s/schematic/the schematic/
Typo: s/Binary that/Binary, which/


Fixed.


+-- includes two executables for PowerPC and Intel i386 (omitted):
+--
+--   0x0000000 ---------------------------------------
+--             |
+-- struct      | 0xcafebabe  FAT_MAGIC                 magic
+-- fat_header  | -------------------------------------
+--             | 0x00000003                            nfat_arch
+--             ---------------------------------------
+--             | 0x00000012  CPU_TYPE_POWERPC          cputype
+--             | -------------------------------------
+--             | 0x00000000  CPU_SUBTYPE_POWERPC_ALL   cpusubtype
+-- struct      | -------------------------------------
+-- fat_arch    | 0x00001000  4096 bytes                offset
+--             | -------------------------------------
+--             | 0x00004224  16932 bytes               size
+--             | -------------------------------------
+--             | 0x0000000c  2^12 = 4096 bytes         align
+--             ---------------------------------------
+--             ---------------------------------------
+--             | 0x00000007  CPU_TYPE_I386             cputype
+--             | -------------------------------------
+--             | 0x00000003  CPU_SUBTYPE_I386_ALL      cpusubtype
+-- struct      | -------------------------------------
+-- fat_arch    | 0x00006000  24576 bytes               offset
+--             | -------------------------------------
+--             | 0x0000292c  10540 bytes               size
+--             | -------------------------------------
+--             | 0x0000000c  2^12 = 4096 bytes         align
+--             ---------------------------------------
+--               Unused
+-- 0x00001000  ---------------------------------------
+--             | 0xfeedface  MH_MAGIC                  magic
+--             | ------------------------------------
+--             | 0x00000012  CPU_TYPE_POWERPC          cputype
+--             | ------------------------------------
+-- struct      | 0x00000000  CPU_SUBTYPE_POWERPC_ALL   cpusubtype
+-- mach_header | ------------------------------------
+--             | 0x00000002  MH_EXECUTE                filetype
+--             | ------------------------------------
+--             | 0x0000000b  10 load commands          ncmds
+--             | ------------------------------------
+--             | 0x00000574  1396 bytes                sizeofcmds
+--             | ------------------------------------
+--             | 0x00000085  DYLDLINK TWOLEVEL         flags
+--             --------------------------------------
+--               Load commands
+--             ---------------------------------------
+--               Data
+--             ---------------------------------------
Side note: This diagramm is amazing thanks!

+--
+--               < x86 executable >
+--
+--
Nit: excess empty line.

Removed.



      
+-- 1. https://github.com/aidansteele/osx-abi-macho-file-format-reference
+-- 2. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/nlist.h
+-- 3. https://github.com/apple-oss-distributions/xnu/blob/xnu-10002.1.13/EXTERNAL_HEADERS/mach-o/fat.h
+-- 4. https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
+-- 5. https://reverseengineering.stackexchange.com/a/6357/46029
+-- 6. http://formats.kaitai.io/mach_o/index.html
+--
+-- Using the same declarations as defined in <src/jit/bcsave.lua>.
+ffi.cdef[[
+typedef struct
+{
+  uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags;
+} mach_header;
+
+typedef struct
+{
+  mach_header; uint32_t reserved;
+} mach_header_64;
+
+typedef struct {
+  uint32_t cmd, cmdsize;
+  char segname[16];
+  uint32_t vmaddr, vmsize, fileoff, filesize;
+  uint32_t maxprot, initprot, nsects, flags;
+} mach_segment_command;
+
+typedef struct {
+  uint32_t cmd, cmdsize;
+  char segname[16];
+  uint64_t vmaddr, vmsize, fileoff, filesize;
+  uint32_t maxprot, initprot, nsects, flags;
+} mach_segment_command_64;
+
+typedef struct {
+  char sectname[16], segname[16];
+  uint32_t addr, size;
+  uint32_t offset, align, reloff, nreloc, flags;
+  uint32_t reserved1, reserved2;
+} mach_section;
+
+typedef struct {
+  char sectname[16], segname[16];
+  uint64_t addr, size;
+  uint32_t offset, align, reloff, nreloc, flags;
+  uint32_t reserved1, reserved2, reserved3;
+} mach_section_64;
+
+typedef struct {
+  uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize;
+} mach_symtab_command;
+
+typedef struct {
+  int32_t strx;
+  uint8_t type, sect;
+  int16_t desc;
+  uint32_t value;
+} mach_nlist;
+
+typedef struct {
+  uint32_t strx;
Should we use here `int32_t` as it is after the changes?
Yes, fixed.

+  uint8_t type, sect;
+  uint16_t desc;
+  uint64_t value;
+} mach_nlist_64;
+
+typedef struct
+{
+  uint32_t magic, nfat_arch;
Should we use here `int32_t` as it is after the changes?
Fixed.

+} mach_fat_header;
+
+typedef struct
+{
+  uint32_t cputype, cpusubtype, offset, size, align;

Should we use here `int32_t` as it is after the changes?

Fixed.



+} mach_fat_arch;
+
+typedef struct {
+  mach_fat_header fat;
+  mach_fat_arch fat_arch[2];
+  struct {
+    mach_header hdr;
+    mach_segment_command seg;
+    mach_section sec;
+    mach_symtab_command sym;
+  } arch[2];
+  mach_nlist sym_entry;
+  uint8_t space[4096];
+} mach_fat_obj;
+
+typedef struct {
+  mach_fat_header fat;
+  mach_fat_arch fat_arch[2];
+  struct {
+    mach_header_64 hdr;
+    mach_segment_command_64 seg;
+    mach_section_64 sec;
+    mach_symtab_command sym;
+  } arch[2];
+  mach_nlist_64 sym_entry;
+  uint8_t space[4096];
+} mach_fat_obj_64;
Should this part be defined in the next patch, since there is no such
cdef at the moment?

OTOH, it makes test more arch-agnostic. So, the adding of the other case
is more simple.
Moved to the third patch.

      
+]]
+
+local function create_obj_file(name, arch)
+  local mach_o_path = os.tmpname() .. '.o'
+  local lua_path = os.getenv('LUA_PATH')
+  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
+  local cmd_fmt = 'LUA_PATH="%s" %s -b -n "%s" -o osx -a %s -e "print()" %s'
+  local cmd = (cmd_fmt):format(lua_path, lua_bin, name, arch, mach_o_path)
+  local ret = os.execute(cmd)
+  assert(ret == 0, 'cannot create an object file')
+  return mach_o_path
+end
+
+-- Parses a buffer in the Mach-O format and returns
+-- the FAT magic number and `nfat_arch`.
+local function read_mach_o(buf, hw_arch)
+  local res = {
+    header = {
+      magic = 0,
+      nfat_arch = 0,
+    },
+    fat_arch = {},
+  }
+
+  local is64 = hw_arch == 'arm64'
+
+  -- Mach-O FAT object.
+  local mach_fat_obj_type = ffi.typeof(is64 and
+                                       'mach_fat_obj_64 *' or 'mach_fat_obj *')
+  local obj = ffi.cast(mach_fat_obj_type, buf)
+
+  -- Mach-O FAT object header.
+  local mach_fat_header_type = ffi.typeof('mach_fat_header *')
+  local mach_fat_header = ffi.cast(mach_fat_header_type, obj.fat)
Looks like this cast is excess:
| src/luajit -e '
| local ffi = require"ffi"
| ffi.cdef[[
| struct test {int a; int b;};
| struct test_outer { struct test t; int c;};
| ]]
| local s = ffi.new("struct test_outer", {{0}, 0})
| print(s.t.a)
| '
| 0

Code in the test casts cdata to ffi type, you snippet demonstrate how to create cdata using ffi.new.

It is similar cases?


+  local be32 = bit.bswap -- Mach-O FAT is BE, target arch is LE.
Please, use the separate line for the comment.
Moved to a new line.

+  res.header.magic = be32(mach_fat_header.magic)
+  res.header.nfat_arch = be32(mach_fat_header.nfat_arch)
+
+  -- Mach-O FAT object arches.
+  local mach_fat_arch_type = ffi.typeof('mach_fat_arch *')
+  for i = 0, res.header.nfat_arch - 1 do
+    local fat_arch = ffi.cast(mach_fat_arch_type, obj.fat_arch[i])
Looks like this cast is excess.

+    local arch = {
+      cputype = be32(fat_arch.cputype),
+      cpusubtype = be32(fat_arch.cpusubtype),
+    }
+    table.insert(res.fat_arch, arch)
+  end
+
+  return res
+end
+
+-- Universal Binary can contain executables for more than one
+-- CPU architecture. For simplicity the test compares
Typo: s/For simplicity/For simplicity,/
Fixed.
Nit: The comment line is "underfilled".
Fixed.

+-- sum of CPU types and CPU subtypes.
Typo: s/sum/the sum/
Fixed.

+-- Numbers below are defined in <src/jit/bcsave.lua:bcsave_machobj>.
Typo: s/Numbers/The numbers/

There is no space for "<src/jit/bcsave.lua:bcsave_machob>" on the line

after adding "The"as you requested. And line become underfileld if

"<src/jit/bcsave.lua:bcsave_machob>" moved to the next line. How do it better:

leave a free space on the line or left "Number" without article?


+-- Original source is XNU source code, <osfmk/mach/machine.h> [1].
Typo: s/Original/The original/
Typo: s/XNU/the XNU/
the same as above, with added articles line looks underfilled and therefore it looks ugly.

+--
+-- 1. https://opensource.apple.com/source/xnu/xnu-4570.41.2/osfmk/mach/machine.h.auto.html
+local sum_cputype = {
+  x86 = 7,
+  x64 = 0x01000007,
+  arm = 7 + 12,
+  arm64 = 0x01000007 + 0x0100000c,
+}
+local sum_cpusubtype = {
+  x86 = 3,
+  x64 = 3,
+  arm = 3 + 9,
+  arm64 = 3 + 0,
+}
I suggest the following for the clarification:

| local cputype = {
|   x86 = 7,
|   x64 = 0x01000007,
|   arm = 7,
|   arm64 = 0x0100000c,
| }
| local cpusubtype = {
|   x86 = 3,
|   x64 = 3,
|   arm = 9,
|   arm64 = 0,
| }

And declare next:

| local sum_cputype = {
|   arm = cputype.arm + cputype.x86,
|   arm64 = cputype.arm64 + cputype.x64,
| }
| local sum_cpusubtype = {
|   arm = cpusubtype.arm + cpusubtype.x86,
|   arm64 = cpusubtype.arm64 + cpusubtype.x64,
| }

You proposed changes breaks the test.

See original dictionaries

https://github.com/tarantool/luajit/blob/ecc45358ddab52c2d0100c74da8efddef2aaf365/src/jit/bcsave.lua#L512-L513

  local cputype = ({ x86={7}, x64={0x01000007}, arm={7,12}, arm64={0x01000007,0x0100000c} })[ctx.arch]


converted to sum_cputype = { x86 = 7, x64 = 0x01000007, arm = 7 + 12, arm64 = 0x01000007 + 0x0100000c}


  local cpusubtype = ({ x86={3}, x64={3}, arm={3,9}, arm64={3,0} })[ctx.arch]


converted to sum_cpusubtype = { x86 = 3, x64 = 3, arm = 3 + 9, arm64 = 3 + 0 }


I don't know how to make this place in the test more clear. See comment that explains these arrays:


  -- Universal Binary can contain executables for more than one                    
  -- CPU architecture. For simplicity, the test compares the sum of                
  -- CPU types and CPU subtypes.                                                   
  -- Numbers below are defined in <src/jit/bcsave.lua:bcsave_machobj>.             
  -- The original source is the XNU source code,                                   
  -- <osfmk/mach/machine.h> [1].                                                   
  --   


Ignored.


Also, it is preferable to use upper case since values are constants.


Made everything uppercase.


+
+-- The function builds Mach-O FAT object file and retrieves
+-- its header fields (magic and nfat_arch)
+-- and fields of the each arch (cputype, cpusubtype).
Typo: s/the each/each/
Fixed. Also line above was underfilled, fixed that.

+--
+-- Mach-O FAT object header can be retrieved with `otool` on macOS:
Typo: s/Mach-O FAT object/A Mach-O FAT object/
Fixed to "The Mach-O FAT object".
Typo: s/header/headers/
single header, why "s"?

      
+--
+-- $ otool -f empty.o
+-- Fat headers
+-- fat_magic 0xcafebabe
+-- nfat_arch 2
+-- <snipped>
Minor: it is better to separate quoting in the comment like the
following to simplify reading:

| $ otool -f empty.o
| Fat headers
| fat_magic 0xcafebabe
| nfat_arch 2
| ...

Feel free to ignore.
Ignored.

+--
+-- CPU type and subtype can be retrieved with `lipo` on macOS:
+--
+-- $ luajit -b -o osx -a arm empty.lua empty.o
+-- $ lipo -archs empty.o
+-- i386 armv7
+-- $ luajit -b -o osx -a arm64 empty.lua empty.o
+-- $ lipo -archs empty.o
+-- x86_64 arm64
+local function build_and_check_mach_o(hw_arch)
Maybe it is better to use a subtest here:
The number of tests is fixed for this function, and it becomes more
clear when we add the additional arch for testing.

+  assert(hw_arch == 'arm' or hw_arch == 'arm64')
+
+  -- FAT_MAGIC is an integer containing the value 0xCAFEBABE in
+  -- big-endian byte order format. On a big-endian host CPU,
+  -- this can be validated using the constant FAT_MAGIC;
+  -- on a little-endian host CPU, it can be validated using
+  -- the constant FAT_CIGAM.
+  --
+  -- FAT_NARCH is an integer specifying the number of fat_arch
+  -- data structures that follow. This is the number of
+  -- architectures contained in this binary.
+  --
+  -- See aforementioned "OS X ABI Mach-O File Format Reference".
Typo: s/aforementioned/the aforementioned/

With added article I need to move name of the reference to a next line,

and line looks under-filled and ugly.


      
+  --
Excess empty line.

It is a visual splitter.



+  local FAT_MAGIC = '0xffffffffcafebabe'
+  local FAT_NARCH = 2
+
+  local MODULE_NAME = 'lango_team'
+
+  local mach_o_obj_path = create_obj_file(MODULE_NAME, hw_arch)
+  local mach_o_buf = require('utils').tools.read_file(mach_o_obj_path)
Please move `require` of the necessary tool to the start of the file to
be consistent with the code style in tests.

moved.


I found that 20 tests in our repository are not consistent too:

test/tarantool-tests/lj-946-print-errors-from-gc-fin-default.test.lua:local script = require('utils').exec.makecmd(arg, { redirect = '2>&1' })
test/tarantool-tests/unit-jit-parse.test.lua:local jparse = require('utils').jit.parse
test/tarantool-tests/lj-981-folding-0.test.lua:local jparse = require('utils').jit.parse
test/tarantool-tests/lj-366-strtab-correct-size.test.lua:  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
test/tarantool-tests/lj-366-strtab-correct-size.test.lua:local elf_content = require('utils').tools.read_file(elf_filename)
test/tarantool-tests/fix-argv2ctype-cts-L-init.test.lua:local script = require('utils').exec.makecmd(arg)
grep: test/tarantool-tests/.lj-736-BC_UCLO-triggers-infinite-loop.lua.swp: binary file matches
test/tarantool-tests/fix-mips64-spare-side-exit-patching.test.lua:local generators = require('utils').jit.generators
test/tarantool-tests/fix-mips64-spare-side-exit-patching.test.lua:local frontend = require('utils').frontend
test/tarantool-tests/lj-1128-table-new-opt-tnew.test.lua:local jparse = require('utils').jit.parse
test/tarantool-tests/gh-6098-fix-side-exit-patching-on-arm64.test.lua:local generators = require('utils').jit.generators
test/tarantool-tests/gh-6098-fix-side-exit-patching-on-arm64.test.lua:local frontend = require('utils').frontend
test/tarantool-tests/lj-351-print-tostring-number.test.lua:local script = require('utils').exec.makecmd(arg)
test/tarantool-tests/lj-flush-on-trace.test.lua:local script = require('utils').exec.makecmd(arg, {
test/tarantool-tests/lj-366-strtab-correct-size.test.lua_:  local lua_bin = require('utils').exec.luacmd(arg):match('%S+')
test/tarantool-tests/lj-366-strtab-correct-size.test.lua_:local elf_content = require('utils').tools.read_file(elf_filename)
test/tarantool-tests/fix-gc-setupvalue.test.lua:local gcisblack = require('utils').gc.isblack
test/tarantool-tests/misclib-getmetrics-lapi.test.lua:local MAXNINS = require('utils').jit.const.maxnins
test/tarantool-tests/gh-4427-ffi-sandwich.test.lua:local script = require('utils').exec.makecmd(arg, {
test/tarantool-tests/fix-jit-dump-ir-conv.test.lua:local jparse = require('utils').jit.parse
test/tarantool-tests/lj-802-panic-at-mcode-protfail.test.lua:local script = require('utils').exec.makecmd(arg, {
test/tarantool-tests/lj-962-stack-overflow-report.test.lua:local LUABIN = require('utils').exec.luabin(arg)



+  assert(mach_o_buf == nil or #mach_o_buf ~= 0, 'cannot read an object file')
Should it be:
| mach_o_buf ~= nil and #mach_o_buf ~= 0


Fixed to " assert(mach_o_buf ~= nil and #mach_o_buf ~= 0, 'cannot read an object file')".



+
+  local mach_o = read_mach_o(mach_o_buf, hw_arch)
+
+  -- Teardown.
+  local retcode = os.remove(mach_o_obj_path)
+  assert(retcode == true, 'remove an object file')
Minor: I suggest refactoring this as the following (`retcode` isn't used
anywhere else):

| assert(os.remove(mach_o_obj_path), 'remove an object file')


Updated.


+
+  local magic_str = string.format('0x%02x', mach_o.header.magic)
Why do we need 02 in the format string?
Also, you may use '%#x' for the same result.

Fixed.



+  test:is(magic_str, FAT_MAGIC,
+          'fat_magic is correct in Mach-O, ' .. hw_arch)
+  test:is(mach_o.header.nfat_arch, FAT_NARCH,
+          'nfat_arch is correct in Mach-O, ' .. hw_arch)
+
+  local total_cputype = 0
+  local total_cpusubtype = 0
+  for i = 1, mach_o.header.nfat_arch do
Minor: I suggest using `FAT_NARCH` here since we've checked their
equality above.

Updated.



+    total_cputype = total_cputype + mach_o.fat_arch[i].cputype
+    total_cpusubtype = total_cpusubtype + mach_o.fat_arch[i].cpusubtype
+  end
+  test:is(total_cputype, sum_cputype[hw_arch],
+          'cputype is correct in Mach-O, ' .. hw_arch)
+  test:is(total_cpusubtype, sum_cpusubtype[hw_arch],
+          'cpusubtype is correct in Mach-O, ' .. hw_arch)
+end
+
+build_and_check_mach_o('arm')
+
+test:done(true)
-- 
2.44.0

    
--------------SUPZZHcVC6xhCPA290klyD7m--