[Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.

Sergey Ostanevich sergos at tarantool.org
Tue Jun 28 22:47:26 MSK 2022


Thanks! 

LGTM.

Best regards,
Sergos 

Tuesday, 28 June 2022, 22:07 +0300 from Kaplun Sergey  <skaplun at tarantool.org>:
>Hi, Sergos!
>
>Thanks for the review!
>
>I've updated commit message to the following:
>
>===================================================================
>FFI/ARM64: Fix pass-by-value struct calling conventions.
>
>(cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>
>If the argument type is a Composite Type then the size of the
>argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>backend makes this rounding unconditionally for arm64 architecture and
>uses the result value to determine the necessary amount of registers
>for the call.
>
>The arm64 parameters passing rules for Homogeneous Floating-point
>Aggregates (HFA) are the following [1]:
>| If the argument is an HFA and there are sufficient unallocated
>| Floating-point registers, then the argument is allocated to
>| Floating-point registers (with one register per member of the HFA).
>
>So for the HFA composed of 3 32-bit float members the extra one (4th)
>register is supposed as occupied.
>
>When the second parameter passed to the function these fields are loaded
>starting from 5th register. As far as the procedure to call uses 6
>registers according to the ABI it leads to the wrong value of the second
>parameter passed to the callee (0 due to the corresponding memset in
>`cconv_struct_init()`).
>
>This patch fixes this case by using the real size of structure in the
>calculation of necessary amount of registers to use.
>
>Sergey Kaplun:
>* added the description and the test for the problem
>
>[1]:  https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing-rules
>
>Part of tarantool/tarantool#6548
>===================================================================
>
>On 02.06.22, sergos wrote:
>> Hi!
>> 
>> Thanks for the patch.
>> Some minor updates to the wording, test updates requested.
>> 
>> Regards,
>> Sergos
>> 
>> > On 9 Dec 2021, at 13:24, Sergey Kaplun < skaplun at tarantool.org > wrote:
>> > 
>> > From: Mike Pall <mike>
>> > 
>> > (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>> > 
>> > The arm64 parameters passing rules for Homogeneous Floating-point
>> > Aggregates (HFA) are the following [1]:
>> > * If the argument type is an HFA or an HVA, then the argument is used
>> 
>> There’s no explanation of HVA here, unlike HFA. Should it help?
>> 
>> >  unmodified.
>> > * If the argument is an HFA and there are sufficient unallocated
>> >  Floating-point registers, then the argument is allocated to
>> >  Floating-pointRegisters (with one register per member of the HFA).
>> > 
>> > Also, if the argument type is a Composite Type then the size of the
>> > argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>> > backend makes this rounding unconditionally for arm64 architecture and
>> > uses the result value to determine the necessary amount of registers
>> > for the call. So for the HFA composed of 2n + 1 (< 4) float members the
>> 
>> I suppose you mean 32bit FP value here. 
>> 
>> > extra one register is supposed as occupied. This leads to the wrong
>> > value (0 due to the corresponding memset in `cconv_struct_init()`) in
>> > this register if the other one parameter is passed to the procedure.
>> 
>> If I read it correct: “the HFA compliance zero-filled padding 
>> can be referred to as a next argument passed”?
>> 
>> > 
>> > This patch fixes this case by using the real size of structure in the
>> > calculation of necessary amount of registers to use.
>> > 
>> > Sergey Kaplun:
>> > * added the description and the test for the problem
>> > 
>> > [1]:  https://developer.arm.com/documentation/ihi0055/b/
>> 
>> The link is 404. ARM is known for shuffling the docs, dunno how to pin it.
>> 
>> > 
>> > Part of tarantool/tarantool#6548
>> > ---
>> > OK, there is an important note before you proceed with the patch:
>> > 
>> > `isfp` variable in arm64 may takes the following values:
>> > 0 - for the pointer argument
>> > 1 - each part of the argument (i.e. field in a structure or floating
>> >    point number itself) takes exactly one register
>> > 2 - the structure is HFA (including complex floats) and compact, i.e.
>> >    two fields may be saved into one register
>> > 
>> > Patch fixes the behaviour in the last case, when the structure is HFA
>> > and takes 8*n + 4 bytes. The variable can't take another value exept
>> > mentioned before, IINM (I've checked it several times). So the magic
>> 
>> I don’t see any tests using half precision FP, say _Float16. It is available
>> since GCC 7 at least. It can help with more variations in the size of the
>> HFA - say 2 or 6.
>
>Yes, LuaJIT FFI interface doesn't know about them :). This kind of
>HFA structures is NIY in LuaJIT.
>
>> 
>> > `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
>> > idea why Mike's prefered such interesting way to fix the behaviour in
>> > this case... May be he has its own branch with different `isfp` values
>> > and this is necessary for compatibility.
>> > 
>> > Side note: I choose a general name (without mentioning arm64) for this C
>> > "library" in purpose. It may be adjusted in the future for brute force
>> > testing of FFI calling conventions for different architectures.
>> > See also:  https://github.com/facebookarchive/luaffifb/blob/master/test.lua
>> > This link has an interesting example of FFI tests. May be we can create
>> > something similar with autogeneration of functions, structures and
>> > tests (not right now, but later).
>> > 
>> > Branch:  https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > Tarantool branch:  https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > 
>> > src/lj_ccall.c                                |  3 +-
>> > test/tarantool-tests/CMakeLists.txt           |  1 +
>> > .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
>> > test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
>> > test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
>> > 5 files changed, 97 insertions(+), 1 deletion(-)
>> > create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
>> > create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
>> > create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
>> > 
>
><snipped>
>
>> > -- 
>> > 2.33.1
>> > 
>> 
>
>-- 
>Best regards,
>Sergey Kaplun
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20220628/e6e739e1/attachment.htm>


More information about the Tarantool-patches mailing list