From: sergos via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Sergey Kaplun <skaplun@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions. Date: Thu, 2 Jun 2022 11:51:53 +0300 [thread overview] Message-ID: <CF55B3FC-4A3E-4B1F-B42D-E5CDEB6D510F@tarantool.org> (raw) In-Reply-To: <f30b7aabffa7c9c2f72bc356bb304175a51edee7.1639039919.git.skaplun@tarantool.org> Hi! Thanks for the patch. Some minor updates to the wording, test updates requested. Regards, Sergos > On 9 Dec 2021, at 13:24, Sergey Kaplun <skaplun@tarantool.org> wrote: > > From: Mike Pall <mike> > > (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224) > > The arm64 parameters passing rules for Homogeneous Floating-point > Aggregates (HFA) are the following [1]: > * If the argument type is an HFA or an HVA, then the argument is used There’s no explanation of HVA here, unlike HFA. Should it help? > unmodified. > * If the argument is an HFA and there are sufficient unallocated > Floating-point registers, then the argument is allocated to > Floating-pointRegisters (with one register per member of the HFA). > > Also, if the argument type is a Composite Type then the size of the > argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI > backend makes this rounding unconditionally for arm64 architecture and > uses the result value to determine the necessary amount of registers > for the call. So for the HFA composed of 2n + 1 (< 4) float members the I suppose you mean 32bit FP value here. > extra one register is supposed as occupied. This leads to the wrong > value (0 due to the corresponding memset in `cconv_struct_init()`) in > this register if the other one parameter is passed to the procedure. If I read it correct: “the HFA compliance zero-filled padding can be referred to as a next argument passed”? > > This patch fixes this case by using the real size of structure in the > calculation of necessary amount of registers to use. > > Sergey Kaplun: > * added the description and the test for the problem > > [1]: https://developer.arm.com/documentation/ihi0055/b/ The link is 404. ARM is known for shuffling the docs, dunno how to pin it. > > Part of tarantool/tarantool#6548 > --- > OK, there is an important note before you proceed with the patch: > > `isfp` variable in arm64 may takes the following values: > 0 - for the pointer argument > 1 - each part of the argument (i.e. field in a structure or floating > point number itself) takes exactly one register > 2 - the structure is HFA (including complex floats) and compact, i.e. > two fields may be saved into one register > > Patch fixes the behaviour in the last case, when the structure is HFA > and takes 8*n + 4 bytes. The variable can't take another value exept > mentioned before, IINM (I've checked it several times). So the magic I don’t see any tests using half precision FP, say _Float16. It is available since GCC 7 at least. It can help with more variations in the size of the HFA - say 2 or 6. > `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no > idea why Mike's prefered such interesting way to fix the behaviour in > this case... May be he has its own branch with different `isfp` values > and this is necessary for compatibility. > > Side note: I choose a general name (without mentioning arm64) for this C > "library" in purpose. It may be adjusted in the future for brute force > testing of FFI calling conventions for different architectures. > See also: https://github.com/facebookarchive/luaffifb/blob/master/test.lua > This link has an interesting example of FFI tests. May be we can create > something similar with autogeneration of functions, structures and > tests (not right now, but later). > > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci > Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci > > src/lj_ccall.c | 3 +- > test/tarantool-tests/CMakeLists.txt | 1 + > .../arm64-ccall-fp-convention.test.lua | 65 +++++++++++++++++++ > test/tarantool-tests/ffi-ccall/CMakeLists.txt | 1 + > test/tarantool-tests/ffi-ccall/libfficcall.c | 28 ++++++++ > 5 files changed, 97 insertions(+), 1 deletion(-) > create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua > create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt > create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c > > diff --git a/src/lj_ccall.c b/src/lj_ccall.c > index f26f1fea..d39ff861 100644 > --- a/src/lj_ccall.c > +++ b/src/lj_ccall.c > @@ -337,7 +337,8 @@ > if (LJ_TARGET_OSX && isva) { \ > /* IOS: All variadic arguments are on the stack. */ \ > } else if (isfp) { /* Try to pass argument in FPRs. */ \ > - int n2 = ctype_isvector(d->info) ? 1 : n*isfp; \ > + int n2 = ctype_isvector(d->info) ? 1 : \ > + isfp == 1 ? n : (d->size >> (4-isfp)); \ > if (nfpr + n2 <= CCALL_NARG_FPR) { \ > dp = &cc->fpr[nfpr]; \ > nfpr += n2; \ > diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt > index 992556d9..50769cda 100644 > --- a/test/tarantool-tests/CMakeLists.txt > +++ b/test/tarantool-tests/CMakeLists.txt > @@ -56,6 +56,7 @@ macro(BuildTestCLib lib sources) > set(LD_LIBRARY_PATH "${CMAKE_CURRENT_BINARY_DIR}:${LD_LIBRARY_PATH}" PARENT_SCOPE) > endmacro() > > +add_subdirectory(ffi-ccall) > add_subdirectory(gh-4427-ffi-sandwich) > add_subdirectory(gh-6098-fix-side-exit-patching-on-arm64) > add_subdirectory(gh-6189-cur_L) > diff --git a/test/tarantool-tests/arm64-ccall-fp-convention.test.lua b/test/tarantool-tests/arm64-ccall-fp-convention.test.lua > new file mode 100644 > index 00000000..c0d2236a > --- /dev/null > +++ b/test/tarantool-tests/arm64-ccall-fp-convention.test.lua > @@ -0,0 +1,65 @@ > +local ffi = require('ffi') > +local tap = require('tap') > + > +local ffi_ccall = ffi.load('libfficcall') > + > +local test = tap.test('ffi c call convention') > +test:plan(3) > + > +ffi.cdef[[ > +struct sz12_t { > + float f1; > + float f2; > + float f3; > +}; > + > +/* Just return the value itself. */ > +struct sz12_t retsz12(struct sz12_t a); > + > +/* Return sum of two. */ > +struct sz12_t add2sz12(struct sz12_t a, struct sz12_t b); > + > +/* Return sum of three. */ > +struct sz12_t add3sz12(struct sz12_t a, struct sz12_t b, struct sz12_t c); > +]] > + > +-- For pretty error messages. > +local sz12_mt = {__tostring = function(sz12) > + return string.format('{%d,%d,%d}', sz12.f1, sz12.f2, sz12.f3) > +end} > + > +local sz12_t = ffi.metatype('struct sz12_t', sz12_mt) > + > +local function new_sz12(f1, f2, f3) > + return sz12_t(f1, f2, f3) > +end > + > +-- For pretty error messages. > +local expected_mt = {__tostring = function(t) > + return '{'..table.concat(t, ',')..'}' > +end} > + > +local function test_sz12(got, expected) > + assert(type(got) == 'cdata') > + assert(type(expected) == 'table') > + expected = setmetatable(expected, expected_mt) > + for i = 1, #expected do > + if got['f'..i] ~= expected[i] then > + error(('bad sz12_t value: got %s, expected %s'):format( > + tostring(got), tostring(expected) > + )) > + end > + end > + return true > +end > + > +-- Test arm64 calling convention for HFA structures. > +-- Need structure which size is not multiple by 8. > +local sz12_111 = new_sz12(1, 1, 1) > +test:ok(test_sz12(sz12_111, {1, 1, 1}), 'base') > +local sz12_222 = ffi_ccall.add2sz12(sz12_111, sz12_111) > +test:ok(test_sz12(sz12_222, {2, 2, 2}), '2 structures as args') > +local sz12_333 = ffi_ccall.add3sz12(sz12_111, sz12_111, sz12_111) > +test:ok(test_sz12(sz12_333, {3, 3, 3}), '3 structures as args') > + > +os.exit(test:check() and 0 or 1) > diff --git a/test/tarantool-tests/ffi-ccall/CMakeLists.txt b/test/tarantool-tests/ffi-ccall/CMakeLists.txt > new file mode 100644 > index 00000000..df937bf8 > --- /dev/null > +++ b/test/tarantool-tests/ffi-ccall/CMakeLists.txt > @@ -0,0 +1 @@ > +BuildTestCLib(libfficcall libfficcall.c) > diff --git a/test/tarantool-tests/ffi-ccall/libfficcall.c b/test/tarantool-tests/ffi-ccall/libfficcall.c > new file mode 100644 > index 00000000..0c043a68 > --- /dev/null > +++ b/test/tarantool-tests/ffi-ccall/libfficcall.c > @@ -0,0 +1,28 @@ > +struct sz12_t { > + float f1; > + float f2; > + float f3; > +}; > + > +struct sz12_t retsz12(struct sz12_t a) > +{ > + return a; > +} > + > +struct sz12_t add2sz12(struct sz12_t a, struct sz12_t b) > +{ > + struct sz12_t res = {0}; > + res.f1 = a.f1 + b.f1; > + res.f2 = a.f2 + b.f2; > + res.f3 = a.f3 + b.f3; > + return res; > +} > + > +struct sz12_t add3sz12(struct sz12_t a, struct sz12_t b, struct sz12_t c) > +{ > + struct sz12_t res = {0}; > + res.f1 = a.f1 + b.f1 + c.f1; > + res.f2 = a.f2 + b.f2 + c.f2; > + res.f3 = a.f3 + b.f3 + c.f3; > + return res; > +} > -- > 2.33.1 >
next prev parent reply other threads:[~2022-06-02 8:52 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches 2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches 2022-06-02 8:51 ` sergos via Tarantool-patches 2022-06-29 8:31 ` Igor Munkin via Tarantool-patches 2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches 2022-06-02 8:51 ` sergos via Tarantool-patches [this message] 2022-06-28 19:05 ` Sergey Kaplun via Tarantool-patches 2022-06-28 19:47 ` Sergey Ostanevich via Tarantool-patches 2022-06-29 8:45 ` Igor Munkin via Tarantool-patches 2022-06-30 12:11 ` [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Igor Munkin via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CF55B3FC-4A3E-4B1F-B42D-E5CDEB6D510F@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=sergos@tarantool.org \ --cc=skaplun@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox