From: Sergey Ostanevich via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: "Sergey Kaplun" <skaplun@tarantool.org>
Cc: tarantool-patches <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
Date: Tue, 28 Jun 2022 22:47:26 +0300 [thread overview]
Message-ID: <1656445646.859726830@f120.i.mail.ru> (raw)
In-Reply-To: <YrtQ5bJZnTqtk7Ro@root>
[-- Attachment #1: Type: text/plain, Size: 6551 bytes --]
Thanks!
LGTM.
Best regards,
Sergos
Tuesday, 28 June 2022, 22:07 +0300 from Kaplun Sergey <skaplun@tarantool.org>:
>Hi, Sergos!
>
>Thanks for the review!
>
>I've updated commit message to the following:
>
>===================================================================
>FFI/ARM64: Fix pass-by-value struct calling conventions.
>
>(cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>
>If the argument type is a Composite Type then the size of the
>argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>backend makes this rounding unconditionally for arm64 architecture and
>uses the result value to determine the necessary amount of registers
>for the call.
>
>The arm64 parameters passing rules for Homogeneous Floating-point
>Aggregates (HFA) are the following [1]:
>| If the argument is an HFA and there are sufficient unallocated
>| Floating-point registers, then the argument is allocated to
>| Floating-point registers (with one register per member of the HFA).
>
>So for the HFA composed of 3 32-bit float members the extra one (4th)
>register is supposed as occupied.
>
>When the second parameter passed to the function these fields are loaded
>starting from 5th register. As far as the procedure to call uses 6
>registers according to the ABI it leads to the wrong value of the second
>parameter passed to the callee (0 due to the corresponding memset in
>`cconv_struct_init()`).
>
>This patch fixes this case by using the real size of structure in the
>calculation of necessary amount of registers to use.
>
>Sergey Kaplun:
>* added the description and the test for the problem
>
>[1]: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing-rules
>
>Part of tarantool/tarantool#6548
>===================================================================
>
>On 02.06.22, sergos wrote:
>> Hi!
>>
>> Thanks for the patch.
>> Some minor updates to the wording, test updates requested.
>>
>> Regards,
>> Sergos
>>
>> > On 9 Dec 2021, at 13:24, Sergey Kaplun < skaplun@tarantool.org > wrote:
>> >
>> > From: Mike Pall <mike>
>> >
>> > (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>> >
>> > The arm64 parameters passing rules for Homogeneous Floating-point
>> > Aggregates (HFA) are the following [1]:
>> > * If the argument type is an HFA or an HVA, then the argument is used
>>
>> There’s no explanation of HVA here, unlike HFA. Should it help?
>>
>> > unmodified.
>> > * If the argument is an HFA and there are sufficient unallocated
>> > Floating-point registers, then the argument is allocated to
>> > Floating-pointRegisters (with one register per member of the HFA).
>> >
>> > Also, if the argument type is a Composite Type then the size of the
>> > argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>> > backend makes this rounding unconditionally for arm64 architecture and
>> > uses the result value to determine the necessary amount of registers
>> > for the call. So for the HFA composed of 2n + 1 (< 4) float members the
>>
>> I suppose you mean 32bit FP value here.
>>
>> > extra one register is supposed as occupied. This leads to the wrong
>> > value (0 due to the corresponding memset in `cconv_struct_init()`) in
>> > this register if the other one parameter is passed to the procedure.
>>
>> If I read it correct: “the HFA compliance zero-filled padding
>> can be referred to as a next argument passed”?
>>
>> >
>> > This patch fixes this case by using the real size of structure in the
>> > calculation of necessary amount of registers to use.
>> >
>> > Sergey Kaplun:
>> > * added the description and the test for the problem
>> >
>> > [1]: https://developer.arm.com/documentation/ihi0055/b/
>>
>> The link is 404. ARM is known for shuffling the docs, dunno how to pin it.
>>
>> >
>> > Part of tarantool/tarantool#6548
>> > ---
>> > OK, there is an important note before you proceed with the patch:
>> >
>> > `isfp` variable in arm64 may takes the following values:
>> > 0 - for the pointer argument
>> > 1 - each part of the argument (i.e. field in a structure or floating
>> > point number itself) takes exactly one register
>> > 2 - the structure is HFA (including complex floats) and compact, i.e.
>> > two fields may be saved into one register
>> >
>> > Patch fixes the behaviour in the last case, when the structure is HFA
>> > and takes 8*n + 4 bytes. The variable can't take another value exept
>> > mentioned before, IINM (I've checked it several times). So the magic
>>
>> I don’t see any tests using half precision FP, say _Float16. It is available
>> since GCC 7 at least. It can help with more variations in the size of the
>> HFA - say 2 or 6.
>
>Yes, LuaJIT FFI interface doesn't know about them :). This kind of
>HFA structures is NIY in LuaJIT.
>
>>
>> > `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
>> > idea why Mike's prefered such interesting way to fix the behaviour in
>> > this case... May be he has its own branch with different `isfp` values
>> > and this is necessary for compatibility.
>> >
>> > Side note: I choose a general name (without mentioning arm64) for this C
>> > "library" in purpose. It may be adjusted in the future for brute force
>> > testing of FFI calling conventions for different architectures.
>> > See also: https://github.com/facebookarchive/luaffifb/blob/master/test.lua
>> > This link has an interesting example of FFI tests. May be we can create
>> > something similar with autogeneration of functions, structures and
>> > tests (not right now, but later).
>> >
>> > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> >
>> > src/lj_ccall.c | 3 +-
>> > test/tarantool-tests/CMakeLists.txt | 1 +
>> > .../arm64-ccall-fp-convention.test.lua | 65 +++++++++++++++++++
>> > test/tarantool-tests/ffi-ccall/CMakeLists.txt | 1 +
>> > test/tarantool-tests/ffi-ccall/libfficcall.c | 28 ++++++++
>> > 5 files changed, 97 insertions(+), 1 deletion(-)
>> > create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
>> > create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
>> > create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
>> >
>
><snipped>
>
>> > --
>> > 2.33.1
>> >
>>
>
>--
>Best regards,
>Sergey Kaplun
[-- Attachment #2: Type: text/html, Size: 10641 bytes --]
next prev parent reply other threads:[~2022-06-28 19:47 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches
2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
2022-06-02 8:51 ` sergos via Tarantool-patches
2022-06-29 8:31 ` Igor Munkin via Tarantool-patches
2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches
2022-06-02 8:51 ` sergos via Tarantool-patches
2022-06-28 19:05 ` Sergey Kaplun via Tarantool-patches
2022-06-28 19:47 ` Sergey Ostanevich via Tarantool-patches [this message]
2022-06-29 8:45 ` Igor Munkin via Tarantool-patches
2022-06-30 12:11 ` [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Igor Munkin via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1656445646.859726830@f120.i.mail.ru \
--to=tarantool-patches@dev.tarantool.org \
--cc=sergos@tarantool.org \
--cc=skaplun@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox