Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.

Tarantool development patches archive
 help / color / mirror / Atom feed

From: Sergey Ostanevich via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: "Sergey Kaplun" <skaplun@tarantool.org>
Cc: tarantool-patches <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches]  [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
Date: Tue, 28 Jun 2022 22:47:26 +0300	[thread overview]
Message-ID: <1656445646.859726830@f120.i.mail.ru> (raw)
In-Reply-To: <YrtQ5bJZnTqtk7Ro@root>

[-- Attachment #1: Type: text/plain, Size: 6551 bytes --]


Thanks! 

LGTM.

Best regards,
Sergos 

Tuesday, 28 June 2022, 22:07 +0300 from Kaplun Sergey  <skaplun@tarantool.org>:
>Hi, Sergos!
>
>Thanks for the review!
>
>I've updated commit message to the following:
>
>===================================================================
>FFI/ARM64: Fix pass-by-value struct calling conventions.
>
>(cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>
>If the argument type is a Composite Type then the size of the
>argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>backend makes this rounding unconditionally for arm64 architecture and
>uses the result value to determine the necessary amount of registers
>for the call.
>
>The arm64 parameters passing rules for Homogeneous Floating-point
>Aggregates (HFA) are the following [1]:
>| If the argument is an HFA and there are sufficient unallocated
>| Floating-point registers, then the argument is allocated to
>| Floating-point registers (with one register per member of the HFA).
>
>So for the HFA composed of 3 32-bit float members the extra one (4th)
>register is supposed as occupied.
>
>When the second parameter passed to the function these fields are loaded
>starting from 5th register. As far as the procedure to call uses 6
>registers according to the ABI it leads to the wrong value of the second
>parameter passed to the callee (0 due to the corresponding memset in
>`cconv_struct_init()`).
>
>This patch fixes this case by using the real size of structure in the
>calculation of necessary amount of registers to use.
>
>Sergey Kaplun:
>* added the description and the test for the problem
>
>[1]:  https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing-rules
>
>Part of tarantool/tarantool#6548
>===================================================================
>
>On 02.06.22, sergos wrote:
>> Hi!
>> 
>> Thanks for the patch.
>> Some minor updates to the wording, test updates requested.
>> 
>> Regards,
>> Sergos
>> 
>> > On 9 Dec 2021, at 13:24, Sergey Kaplun < skaplun@tarantool.org > wrote:
>> > 
>> > From: Mike Pall <mike>
>> > 
>> > (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>> > 
>> > The arm64 parameters passing rules for Homogeneous Floating-point
>> > Aggregates (HFA) are the following [1]:
>> > * If the argument type is an HFA or an HVA, then the argument is used
>> 
>> There’s no explanation of HVA here, unlike HFA. Should it help?
>> 
>> >  unmodified.
>> > * If the argument is an HFA and there are sufficient unallocated
>> >  Floating-point registers, then the argument is allocated to
>> >  Floating-pointRegisters (with one register per member of the HFA).
>> > 
>> > Also, if the argument type is a Composite Type then the size of the
>> > argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>> > backend makes this rounding unconditionally for arm64 architecture and
>> > uses the result value to determine the necessary amount of registers
>> > for the call. So for the HFA composed of 2n + 1 (< 4) float members the
>> 
>> I suppose you mean 32bit FP value here. 
>> 
>> > extra one register is supposed as occupied. This leads to the wrong
>> > value (0 due to the corresponding memset in `cconv_struct_init()`) in
>> > this register if the other one parameter is passed to the procedure.
>> 
>> If I read it correct: “the HFA compliance zero-filled padding 
>> can be referred to as a next argument passed”?
>> 
>> > 
>> > This patch fixes this case by using the real size of structure in the
>> > calculation of necessary amount of registers to use.
>> > 
>> > Sergey Kaplun:
>> > * added the description and the test for the problem
>> > 
>> > [1]:  https://developer.arm.com/documentation/ihi0055/b/
>> 
>> The link is 404. ARM is known for shuffling the docs, dunno how to pin it.
>> 
>> > 
>> > Part of tarantool/tarantool#6548
>> > ---
>> > OK, there is an important note before you proceed with the patch:
>> > 
>> > `isfp` variable in arm64 may takes the following values:
>> > 0 - for the pointer argument
>> > 1 - each part of the argument (i.e. field in a structure or floating
>> >    point number itself) takes exactly one register
>> > 2 - the structure is HFA (including complex floats) and compact, i.e.
>> >    two fields may be saved into one register
>> > 
>> > Patch fixes the behaviour in the last case, when the structure is HFA
>> > and takes 8*n + 4 bytes. The variable can't take another value exept
>> > mentioned before, IINM (I've checked it several times). So the magic
>> 
>> I don’t see any tests using half precision FP, say _Float16. It is available
>> since GCC 7 at least. It can help with more variations in the size of the
>> HFA - say 2 or 6.
>
>Yes, LuaJIT FFI interface doesn't know about them :). This kind of
>HFA structures is NIY in LuaJIT.
>
>> 
>> > `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
>> > idea why Mike's prefered such interesting way to fix the behaviour in
>> > this case... May be he has its own branch with different `isfp` values
>> > and this is necessary for compatibility.
>> > 
>> > Side note: I choose a general name (without mentioning arm64) for this C
>> > "library" in purpose. It may be adjusted in the future for brute force
>> > testing of FFI calling conventions for different architectures.
>> > See also:  https://github.com/facebookarchive/luaffifb/blob/master/test.lua
>> > This link has an interesting example of FFI tests. May be we can create
>> > something similar with autogeneration of functions, structures and
>> > tests (not right now, but later).
>> > 
>> > Branch:  https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > Tarantool branch:  https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > 
>> > src/lj_ccall.c                                |  3 +-
>> > test/tarantool-tests/CMakeLists.txt           |  1 +
>> > .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
>> > test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
>> > test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
>> > 5 files changed, 97 insertions(+), 1 deletion(-)
>> > create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
>> > create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
>> > create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
>> > 
>
><snipped>
>
>> > -- 
>> > 2.33.1
>> > 
>> 
>
>-- 
>Best regards,
>Sergey Kaplun

[-- Attachment #2: Type: text/html, Size: 10641 bytes --]

next prev parent reply	other threads:[~2022-06-28 19:47 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches
2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
2022-06-02  8:51   ` sergos via Tarantool-patches
2022-06-29  8:31   ` Igor Munkin via Tarantool-patches
2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches
2022-06-02  8:51   ` sergos via Tarantool-patches
2022-06-28 19:05     ` Sergey Kaplun via Tarantool-patches
2022-06-28 19:47       ` Sergey Ostanevich via Tarantool-patches [this message]
2022-06-29  8:45       ` Igor Munkin via Tarantool-patches
2022-06-30 12:11 ` [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Igor Munkin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1656445646.859726830@f120.i.mail.ru \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=sergos@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches]  [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox