Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA
@ 2021-12-09 10:24 Sergey Kaplun via Tarantool-patches
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-12-09 10:24 UTC (permalink / raw)
  To: Sergey Ostanevich, Igor Munkin; +Cc: tarantool-patches

The first patch is necessary to be able to run tests using `ffi.load`
without selfrun. We need just to use the set environment variable
itself. And the second patch is an actual fix.

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci

Mike Pall (1):
  FFI/ARM64: Fix pass-by-value struct calling conventions.

Sergey Kaplun (1):
  test: set DYLD_LIBRARY_PATH environment variable

 src/lj_ccall.c                                |  3 +-
 test/tarantool-tests/CMakeLists.txt           | 20 +++---
 .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
 test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
 test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
 .../gh-4427-ffi-sandwich.test.lua             |  4 --
 .../lj-flush-on-trace.test.lua                |  4 --
 test/tarantool-tests/utils.lua                | 17 -----
 8 files changed, 108 insertions(+), 34 deletions(-)
 create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
 create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
 create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c

-- 
2.33.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable
  2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches
@ 2021-12-09 10:24 ` Sergey Kaplun via Tarantool-patches
  2022-06-02  8:51   ` sergos via Tarantool-patches
  2022-06-29  8:31   ` Igor Munkin via Tarantool-patches
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches
  2022-06-30 12:11 ` [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Igor Munkin via Tarantool-patches
  2 siblings, 2 replies; 10+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-12-09 10:24 UTC (permalink / raw)
  To: Sergey Ostanevich, Igor Munkin; +Cc: tarantool-patches

This patch allows to use tests with `ffi.load()` without usage of
`utils.selfrun()` by setting the DYLD_LIBRARY_PATH variable for the
process directly via additional env command. Also, this commit "reverts"
part of the commit fae1681dd1c117a913f99cbeed0ca2b87cf581a1 ('test: fix
dynamic modules loading on MacOS') since there is no need to tweak
environment while running those tests.

Needed for tarantool/tarantool#6548
---
 test/tarantool-tests/CMakeLists.txt           | 19 +++++++++++--------
 .../gh-4427-ffi-sandwich.test.lua             |  4 ----
 .../lj-flush-on-trace.test.lua                |  4 ----
 test/tarantool-tests/utils.lua                | 17 -----------------
 4 files changed, 11 insertions(+), 33 deletions(-)

diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index a872fa5e..992556d9 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -94,22 +94,21 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
   # since some programs sanitize the environment before they start
   # child processes. Specifically, environment variables starting
   # with DYLD_ and LD_ are unset for child process started by
-  # system programs (like /usr/bin/env used for preparing testing
-  # environment). For more info, see the docs[2] below.
+  # other programs (like /usr/bin/prove --exec using for launching
+  # test suite). For more info, see the docs[2] below.
   #
   # These environment variables are used by FFI machinery to find
   # the proper shared library, hence we can still tweak testing
   # environment before calling <ffi.load>. However, the value
   # can't be passed via the standard environment variable, so we
-  # prepend TEST_ prefix to its name to get around SIP magic
-  # tricks. Finally, to set the variable required by FFI machinery
-  # the introduced <utils.tweakenv> routine is used.
+  # use env call in prove's --exec flag value to get around SIP
+  # magic tricks.
   #
   # [1]: https://support.apple.com/en-us/HT204899
   # [2]: https://developer.apple.com/library/archive/documentation/Security/Conceptual/System_Integrity_Protection_Guide/RuntimeProtections/RuntimeProtections.html
-  list(APPEND LUA_TEST_ENV TEST_DYLD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
+  list(APPEND LUA_TEST_ENV_MORE DYLD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
 else()
-  list(APPEND LUA_TEST_ENV LD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
+  list(APPEND LUA_TEST_ENV_MORE LD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
 endif()
 
 # LUA_CPATH and LD_LIBRARY_PATH variables and also TESTLIBS list
@@ -120,10 +119,14 @@ add_custom_target(tarantool-tests
 add_custom_command(TARGET tarantool-tests
   COMMENT "Running Tarantool tests"
   COMMAND
+  # XXX: We can't move everything to the "inner" env, since there
+  # are some issues with escaping ';' for different shells. As
+  # a result LUA_PATH/LUA_CPATH variables are set via the "outer"
+  # env, since they are not stripped by SIP like LD_*/DYLD_* are.
   env
     ${LUA_TEST_ENV}
     ${PROVE} ${CMAKE_CURRENT_SOURCE_DIR}
-      --exec '${LUAJIT_TEST_COMMAND}'
+      --exec 'env ${LUA_TEST_ENV_MORE} ${LUAJIT_TEST_COMMAND}'
       --ext ${LUA_TEST_SUFFIX}
       ${LUA_TEST_FLAGS}
   WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
diff --git a/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua b/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua
index 3e6801a5..dd02130c 100644
--- a/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua
+++ b/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua
@@ -3,10 +3,6 @@ local utils = require('utils')
 -- Disabled on *BSD due to #4819.
 utils.skipcond(jit.os == 'BSD', 'Disabled due to #4819')
 
--- XXX: Tweak the process environment to get around SIP.
--- See the comment in suite CMakeLists.txt for more info.
-utils.tweakenv(jit.os == 'OSX', 'DYLD_LIBRARY_PATH')
-
 utils.selfrun(arg, {
   {
     arg = {
diff --git a/test/tarantool-tests/lj-flush-on-trace.test.lua b/test/tarantool-tests/lj-flush-on-trace.test.lua
index 81fd6e8f..c46b93f0 100644
--- a/test/tarantool-tests/lj-flush-on-trace.test.lua
+++ b/test/tarantool-tests/lj-flush-on-trace.test.lua
@@ -3,10 +3,6 @@ local utils = require('utils')
 -- Disabled on *BSD due to #4819.
 utils.skipcond(jit.os == 'BSD', 'Disabled due to #4819')
 
--- XXX: Tweak the process environment to get around SIP.
--- See the comment in suite CMakeLists.txt for more info.
-utils.tweakenv(jit.os == 'OSX', 'DYLD_LIBRARY_PATH')
-
 utils.selfrun(arg, {
   {
     arg = {
diff --git a/test/tarantool-tests/utils.lua b/test/tarantool-tests/utils.lua
index 5bd42b30..3b809515 100644
--- a/test/tarantool-tests/utils.lua
+++ b/test/tarantool-tests/utils.lua
@@ -1,13 +1,8 @@
 local M = {}
 
-local ffi = require('ffi')
 local tap = require('tap')
 local bc = require('jit.bc')
 
-ffi.cdef([[
-  int setenv(const char *name, const char *value, int overwrite);
-]])
-
 local function luacmd(args)
   -- arg[-1] is guaranteed to be not nil.
   local idx = -2
@@ -78,18 +73,6 @@ function M.skipcond(condition, message)
   os.exit(test:check() and 0 or 1)
 end
 
-function M.tweakenv(condition, variable)
-  if not condition or os.getenv(variable) then return end
-  local testvar = assert(os.getenv('TEST_' .. variable),
-                         ('Neither %s nor auxiliary TEST_%s variables are set')
-                         :format(variable, variable))
-  -- XXX: The third argument of setenv(3) is set to zero to forbid
-  -- overwriting the <variable>. Since there is the check above
-  -- whether this <variable> is set in the process environment, it
-  -- just makes this solution foolproof.
-  ffi.C.setenv(variable, testvar, 0)
-end
-
 function M.hasbc(f, bytecode)
   assert(type(f) == 'function', 'argument #1 should be a function')
   assert(type(bytecode) == 'string', 'argument #2 should be a string')
-- 
2.33.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
  2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
@ 2021-12-09 10:24 ` Sergey Kaplun via Tarantool-patches
  2022-06-02  8:51   ` sergos via Tarantool-patches
  2022-06-30 12:11 ` [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Igor Munkin via Tarantool-patches
  2 siblings, 1 reply; 10+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2021-12-09 10:24 UTC (permalink / raw)
  To: Sergey Ostanevich, Igor Munkin; +Cc: tarantool-patches

From: Mike Pall <mike>

(cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)

The arm64 parameters passing rules for Homogeneous Floating-point
Aggregates (HFA) are the following [1]:
* If the argument type is an HFA or an HVA, then the argument is used
  unmodified.
* If the argument is an HFA and there are sufficient unallocated
  Floating-point registers, then the argument is allocated to
  Floating-pointRegisters (with one register per member of the HFA).

Also, if the argument type is a Composite Type then the size of the
argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
backend makes this rounding unconditionally for arm64 architecture and
uses the result value to determine the necessary amount of registers
for the call. So for the HFA composed of 2n + 1 (< 4) float members the
extra one register is supposed as occupied. This leads to the wrong
value (0 due to the corresponding memset in `cconv_struct_init()`) in
this register if the other one parameter is passed to the procedure.

This patch fixes this case by using the real size of structure in the
calculation of necessary amount of registers to use.

Sergey Kaplun:
* added the description and the test for the problem

[1]: https://developer.arm.com/documentation/ihi0055/b/

Part of tarantool/tarantool#6548
---
OK, there is an important note before you proceed with the patch:

`isfp` variable in arm64 may takes the following values:
0 - for the pointer argument
1 - each part of the argument (i.e. field in a structure or floating
    point number itself) takes exactly one register
2 - the structure is HFA (including complex floats) and compact, i.e.
    two fields may be saved into one register

Patch fixes the behaviour in the last case, when the structure is HFA
and takes 8*n + 4 bytes. The variable can't take another value exept
mentioned before, IINM (I've checked it several times). So the magic
`d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
idea why Mike's prefered such interesting way to fix the behaviour in
this case... May be he has its own branch with different `isfp` values
and this is necessary for compatibility.

Side note: I choose a general name (without mentioning arm64) for this C
"library" in purpose. It may be adjusted in the future for brute force
testing of FFI calling conventions for different architectures.
See also: https://github.com/facebookarchive/luaffifb/blob/master/test.lua
This link has an interesting example of FFI tests. May be we can create
something similar with autogeneration of functions, structures and
tests (not right now, but later).

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci

 src/lj_ccall.c                                |  3 +-
 test/tarantool-tests/CMakeLists.txt           |  1 +
 .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
 test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
 test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
 5 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
 create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
 create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c

diff --git a/src/lj_ccall.c b/src/lj_ccall.c
index f26f1fea..d39ff861 100644
--- a/src/lj_ccall.c
+++ b/src/lj_ccall.c
@@ -337,7 +337,8 @@
   if (LJ_TARGET_OSX && isva) { \
     /* IOS: All variadic arguments are on the stack. */ \
   } else if (isfp) {  /* Try to pass argument in FPRs. */ \
-    int n2 = ctype_isvector(d->info) ? 1 : n*isfp; \
+    int n2 = ctype_isvector(d->info) ? 1 : \
+	     isfp == 1 ? n : (d->size >> (4-isfp)); \
     if (nfpr + n2 <= CCALL_NARG_FPR) { \
       dp = &cc->fpr[nfpr]; \
       nfpr += n2; \
diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
index 992556d9..50769cda 100644
--- a/test/tarantool-tests/CMakeLists.txt
+++ b/test/tarantool-tests/CMakeLists.txt
@@ -56,6 +56,7 @@ macro(BuildTestCLib lib sources)
   set(LD_LIBRARY_PATH "${CMAKE_CURRENT_BINARY_DIR}:${LD_LIBRARY_PATH}" PARENT_SCOPE)
 endmacro()
 
+add_subdirectory(ffi-ccall)
 add_subdirectory(gh-4427-ffi-sandwich)
 add_subdirectory(gh-6098-fix-side-exit-patching-on-arm64)
 add_subdirectory(gh-6189-cur_L)
diff --git a/test/tarantool-tests/arm64-ccall-fp-convention.test.lua b/test/tarantool-tests/arm64-ccall-fp-convention.test.lua
new file mode 100644
index 00000000..c0d2236a
--- /dev/null
+++ b/test/tarantool-tests/arm64-ccall-fp-convention.test.lua
@@ -0,0 +1,65 @@
+local ffi = require('ffi')
+local tap = require('tap')
+
+local ffi_ccall = ffi.load('libfficcall')
+
+local test = tap.test('ffi c call convention')
+test:plan(3)
+
+ffi.cdef[[
+struct sz12_t {
+	float f1;
+	float f2;
+	float f3;
+};
+
+/* Just return the value itself. */
+struct sz12_t retsz12(struct sz12_t a);
+
+/* Return sum of two. */
+struct sz12_t add2sz12(struct sz12_t a, struct sz12_t b);
+
+/* Return sum of three. */
+struct sz12_t add3sz12(struct sz12_t a, struct sz12_t b, struct sz12_t c);
+]]
+
+-- For pretty error messages.
+local sz12_mt = {__tostring = function(sz12)
+  return string.format('{%d,%d,%d}', sz12.f1, sz12.f2, sz12.f3)
+end}
+
+local sz12_t = ffi.metatype('struct sz12_t', sz12_mt)
+
+local function new_sz12(f1, f2, f3)
+  return sz12_t(f1, f2, f3)
+end
+
+-- For pretty error messages.
+local expected_mt = {__tostring = function(t)
+  return '{'..table.concat(t, ',')..'}'
+end}
+
+local function test_sz12(got, expected)
+  assert(type(got) == 'cdata')
+  assert(type(expected) == 'table')
+  expected = setmetatable(expected, expected_mt)
+  for i = 1, #expected do
+    if got['f'..i] ~= expected[i] then
+      error(('bad sz12_t value: got %s, expected %s'):format(
+        tostring(got), tostring(expected)
+      ))
+    end
+  end
+  return true
+end
+
+-- Test arm64 calling convention for HFA structures.
+-- Need structure which size is not multiple by 8.
+local sz12_111 = new_sz12(1, 1, 1)
+test:ok(test_sz12(sz12_111, {1, 1, 1}), 'base')
+local sz12_222 = ffi_ccall.add2sz12(sz12_111, sz12_111)
+test:ok(test_sz12(sz12_222, {2, 2, 2}), '2 structures as args')
+local sz12_333 = ffi_ccall.add3sz12(sz12_111, sz12_111, sz12_111)
+test:ok(test_sz12(sz12_333, {3, 3, 3}), '3 structures as args')
+
+os.exit(test:check() and 0 or 1)
diff --git a/test/tarantool-tests/ffi-ccall/CMakeLists.txt b/test/tarantool-tests/ffi-ccall/CMakeLists.txt
new file mode 100644
index 00000000..df937bf8
--- /dev/null
+++ b/test/tarantool-tests/ffi-ccall/CMakeLists.txt
@@ -0,0 +1 @@
+BuildTestCLib(libfficcall libfficcall.c)
diff --git a/test/tarantool-tests/ffi-ccall/libfficcall.c b/test/tarantool-tests/ffi-ccall/libfficcall.c
new file mode 100644
index 00000000..0c043a68
--- /dev/null
+++ b/test/tarantool-tests/ffi-ccall/libfficcall.c
@@ -0,0 +1,28 @@
+struct sz12_t {
+	float f1;
+	float f2;
+	float f3;
+};
+
+struct sz12_t retsz12(struct sz12_t a)
+{
+	return a;
+}
+
+struct sz12_t add2sz12(struct sz12_t a, struct sz12_t b)
+{
+	struct sz12_t res = {0};
+	res.f1 = a.f1 + b.f1;
+	res.f2 = a.f2 + b.f2;
+	res.f3 = a.f3 + b.f3;
+	return res;
+}
+
+struct sz12_t add3sz12(struct sz12_t a, struct sz12_t b, struct sz12_t c)
+{
+	struct sz12_t res = {0};
+	res.f1 = a.f1 + b.f1 + c.f1;
+	res.f2 = a.f2 + b.f2 + c.f2;
+	res.f3 = a.f3 + b.f3 + c.f3;
+	return res;
+}
-- 
2.33.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
@ 2022-06-02  8:51   ` sergos via Tarantool-patches
  2022-06-29  8:31   ` Igor Munkin via Tarantool-patches
  1 sibling, 0 replies; 10+ messages in thread
From: sergos via Tarantool-patches @ 2022-06-02  8:51 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi!

Thanks for the patch, I can’t find any problems in it, although I couldn’t
apply it on the latest luajit.

Sergos.


> On 9 Dec 2021, at 13:24, Sergey Kaplun <skaplun@tarantool.org> wrote:
> 
> This patch allows to use tests with `ffi.load()` without usage of
> `utils.selfrun()` by setting the DYLD_LIBRARY_PATH variable for the
> process directly via additional env command. Also, this commit "reverts"
> part of the commit fae1681dd1c117a913f99cbeed0ca2b87cf581a1 ('test: fix
> dynamic modules loading on MacOS') since there is no need to tweak
> environment while running those tests.
> 
> Needed for tarantool/tarantool#6548
> ---
> test/tarantool-tests/CMakeLists.txt           | 19 +++++++++++--------
> .../gh-4427-ffi-sandwich.test.lua             |  4 ----
> .../lj-flush-on-trace.test.lua                |  4 ----
> test/tarantool-tests/utils.lua                | 17 -----------------
> 4 files changed, 11 insertions(+), 33 deletions(-)
> 
> diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
> index a872fa5e..992556d9 100644
> --- a/test/tarantool-tests/CMakeLists.txt
> +++ b/test/tarantool-tests/CMakeLists.txt
> @@ -94,22 +94,21 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
>   # since some programs sanitize the environment before they start
>   # child processes. Specifically, environment variables starting
>   # with DYLD_ and LD_ are unset for child process started by
> -  # system programs (like /usr/bin/env used for preparing testing
> -  # environment). For more info, see the docs[2] below.
> +  # other programs (like /usr/bin/prove --exec using for launching
> +  # test suite). For more info, see the docs[2] below.
>   #
>   # These environment variables are used by FFI machinery to find
>   # the proper shared library, hence we can still tweak testing
>   # environment before calling <ffi.load>. However, the value
>   # can't be passed via the standard environment variable, so we
> -  # prepend TEST_ prefix to its name to get around SIP magic
> -  # tricks. Finally, to set the variable required by FFI machinery
> -  # the introduced <utils.tweakenv> routine is used.
> +  # use env call in prove's --exec flag value to get around SIP
> +  # magic tricks.
>   #
>   # [1]: https://support.apple.com/en-us/HT204899
>   # [2]: https://developer.apple.com/library/archive/documentation/Security/Conceptual/System_Integrity_Protection_Guide/RuntimeProtections/RuntimeProtections.html
> -  list(APPEND LUA_TEST_ENV TEST_DYLD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
> +  list(APPEND LUA_TEST_ENV_MORE DYLD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
> else()
> -  list(APPEND LUA_TEST_ENV LD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
> +  list(APPEND LUA_TEST_ENV_MORE LD_LIBRARY_PATH="${LD_LIBRARY_PATH}")
> endif()
> 
> # LUA_CPATH and LD_LIBRARY_PATH variables and also TESTLIBS list
> @@ -120,10 +119,14 @@ add_custom_target(tarantool-tests
> add_custom_command(TARGET tarantool-tests
>   COMMENT "Running Tarantool tests"
>   COMMAND
> +  # XXX: We can't move everything to the "inner" env, since there
> +  # are some issues with escaping ';' for different shells. As
> +  # a result LUA_PATH/LUA_CPATH variables are set via the "outer"
> +  # env, since they are not stripped by SIP like LD_*/DYLD_* are.
>   env
>     ${LUA_TEST_ENV}
>     ${PROVE} ${CMAKE_CURRENT_SOURCE_DIR}
> -      --exec '${LUAJIT_TEST_COMMAND}'
> +      --exec 'env ${LUA_TEST_ENV_MORE} ${LUAJIT_TEST_COMMAND}'
>       --ext ${LUA_TEST_SUFFIX}
>       ${LUA_TEST_FLAGS}
>   WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
> diff --git a/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua b/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua
> index 3e6801a5..dd02130c 100644
> --- a/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua
> +++ b/test/tarantool-tests/gh-4427-ffi-sandwich.test.lua
> @@ -3,10 +3,6 @@ local utils = require('utils')
> -- Disabled on *BSD due to #4819.
> utils.skipcond(jit.os == 'BSD', 'Disabled due to #4819')
> 
> --- XXX: Tweak the process environment to get around SIP.
> --- See the comment in suite CMakeLists.txt for more info.
> -utils.tweakenv(jit.os == 'OSX', 'DYLD_LIBRARY_PATH')
> -
> utils.selfrun(arg, {
>   {
>     arg = {
> diff --git a/test/tarantool-tests/lj-flush-on-trace.test.lua b/test/tarantool-tests/lj-flush-on-trace.test.lua
> index 81fd6e8f..c46b93f0 100644
> --- a/test/tarantool-tests/lj-flush-on-trace.test.lua
> +++ b/test/tarantool-tests/lj-flush-on-trace.test.lua
> @@ -3,10 +3,6 @@ local utils = require('utils')
> -- Disabled on *BSD due to #4819.
> utils.skipcond(jit.os == 'BSD', 'Disabled due to #4819')
> 
> --- XXX: Tweak the process environment to get around SIP.
> --- See the comment in suite CMakeLists.txt for more info.
> -utils.tweakenv(jit.os == 'OSX', 'DYLD_LIBRARY_PATH')
> -
> utils.selfrun(arg, {
>   {
>     arg = {
> diff --git a/test/tarantool-tests/utils.lua b/test/tarantool-tests/utils.lua
> index 5bd42b30..3b809515 100644
> --- a/test/tarantool-tests/utils.lua
> +++ b/test/tarantool-tests/utils.lua
> @@ -1,13 +1,8 @@
> local M = {}
> 
> -local ffi = require('ffi')
> local tap = require('tap')
> local bc = require('jit.bc')
> 
> -ffi.cdef([[
> -  int setenv(const char *name, const char *value, int overwrite);
> -]])
> -
> local function luacmd(args)
>   -- arg[-1] is guaranteed to be not nil.
>   local idx = -2
> @@ -78,18 +73,6 @@ function M.skipcond(condition, message)
>   os.exit(test:check() and 0 or 1)
> end
> 
> -function M.tweakenv(condition, variable)
> -  if not condition or os.getenv(variable) then return end
> -  local testvar = assert(os.getenv('TEST_' .. variable),
> -                         ('Neither %s nor auxiliary TEST_%s variables are set')
> -                         :format(variable, variable))
> -  -- XXX: The third argument of setenv(3) is set to zero to forbid
> -  -- overwriting the <variable>. Since there is the check above
> -  -- whether this <variable> is set in the process environment, it
> -  -- just makes this solution foolproof.
> -  ffi.C.setenv(variable, testvar, 0)
> -end
> -
> function M.hasbc(f, bytecode)
>   assert(type(f) == 'function', 'argument #1 should be a function')
>   assert(type(bytecode) == 'string', 'argument #2 should be a string')
> -- 
> 2.33.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches
@ 2022-06-02  8:51   ` sergos via Tarantool-patches
  2022-06-28 19:05     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 10+ messages in thread
From: sergos via Tarantool-patches @ 2022-06-02  8:51 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi!

Thanks for the patch.
Some minor updates to the wording, test updates requested.

Regards,
Sergos

> On 9 Dec 2021, at 13:24, Sergey Kaplun <skaplun@tarantool.org> wrote:
> 
> From: Mike Pall <mike>
> 
> (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
> 
> The arm64 parameters passing rules for Homogeneous Floating-point
> Aggregates (HFA) are the following [1]:
> * If the argument type is an HFA or an HVA, then the argument is used

There’s no explanation of HVA here, unlike HFA. Should it help?

>  unmodified.
> * If the argument is an HFA and there are sufficient unallocated
>  Floating-point registers, then the argument is allocated to
>  Floating-pointRegisters (with one register per member of the HFA).
> 
> Also, if the argument type is a Composite Type then the size of the
> argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
> backend makes this rounding unconditionally for arm64 architecture and
> uses the result value to determine the necessary amount of registers
> for the call. So for the HFA composed of 2n + 1 (< 4) float members the

I suppose you mean 32bit FP value here. 

> extra one register is supposed as occupied. This leads to the wrong
> value (0 due to the corresponding memset in `cconv_struct_init()`) in
> this register if the other one parameter is passed to the procedure.

If I read it correct: “the HFA compliance zero-filled padding  
can be referred to as a next argument passed”?

> 
> This patch fixes this case by using the real size of structure in the
> calculation of necessary amount of registers to use.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> [1]: https://developer.arm.com/documentation/ihi0055/b/

The link is 404. ARM is known for shuffling the docs, dunno how to pin it.

> 
> Part of tarantool/tarantool#6548
> ---
> OK, there is an important note before you proceed with the patch:
> 
> `isfp` variable in arm64 may takes the following values:
> 0 - for the pointer argument
> 1 - each part of the argument (i.e. field in a structure or floating
>    point number itself) takes exactly one register
> 2 - the structure is HFA (including complex floats) and compact, i.e.
>    two fields may be saved into one register
> 
> Patch fixes the behaviour in the last case, when the structure is HFA
> and takes 8*n + 4 bytes. The variable can't take another value exept
> mentioned before, IINM (I've checked it several times). So the magic

I don’t see any tests using half precision FP, say _Float16. It is available
since GCC 7 at least. It can help with more variations in the size of the
HFA - say 2 or 6.
 
> `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
> idea why Mike's prefered such interesting way to fix the behaviour in
> this case... May be he has its own branch with different `isfp` values
> and this is necessary for compatibility.
> 
> Side note: I choose a general name (without mentioning arm64) for this C
> "library" in purpose. It may be adjusted in the future for brute force
> testing of FFI calling conventions for different architectures.
> See also: https://github.com/facebookarchive/luaffifb/blob/master/test.lua
> This link has an interesting example of FFI tests. May be we can create
> something similar with autogeneration of functions, structures and
> tests (not right now, but later).
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
> Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
> 
> src/lj_ccall.c                                |  3 +-
> test/tarantool-tests/CMakeLists.txt           |  1 +
> .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
> test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
> test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
> 5 files changed, 97 insertions(+), 1 deletion(-)
> create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
> create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
> create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
> 
> diff --git a/src/lj_ccall.c b/src/lj_ccall.c
> index f26f1fea..d39ff861 100644
> --- a/src/lj_ccall.c
> +++ b/src/lj_ccall.c
> @@ -337,7 +337,8 @@
>   if (LJ_TARGET_OSX && isva) { \
>     /* IOS: All variadic arguments are on the stack. */ \
>   } else if (isfp) {  /* Try to pass argument in FPRs. */ \
> -    int n2 = ctype_isvector(d->info) ? 1 : n*isfp; \
> +    int n2 = ctype_isvector(d->info) ? 1 : \
> +	     isfp == 1 ? n : (d->size >> (4-isfp)); \
>     if (nfpr + n2 <= CCALL_NARG_FPR) { \
>       dp = &cc->fpr[nfpr]; \
>       nfpr += n2; \
> diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt
> index 992556d9..50769cda 100644
> --- a/test/tarantool-tests/CMakeLists.txt
> +++ b/test/tarantool-tests/CMakeLists.txt
> @@ -56,6 +56,7 @@ macro(BuildTestCLib lib sources)
>   set(LD_LIBRARY_PATH "${CMAKE_CURRENT_BINARY_DIR}:${LD_LIBRARY_PATH}" PARENT_SCOPE)
> endmacro()
> 
> +add_subdirectory(ffi-ccall)
> add_subdirectory(gh-4427-ffi-sandwich)
> add_subdirectory(gh-6098-fix-side-exit-patching-on-arm64)
> add_subdirectory(gh-6189-cur_L)
> diff --git a/test/tarantool-tests/arm64-ccall-fp-convention.test.lua b/test/tarantool-tests/arm64-ccall-fp-convention.test.lua
> new file mode 100644
> index 00000000..c0d2236a
> --- /dev/null
> +++ b/test/tarantool-tests/arm64-ccall-fp-convention.test.lua
> @@ -0,0 +1,65 @@
> +local ffi = require('ffi')
> +local tap = require('tap')
> +
> +local ffi_ccall = ffi.load('libfficcall')
> +
> +local test = tap.test('ffi c call convention')
> +test:plan(3)
> +
> +ffi.cdef[[
> +struct sz12_t {
> +	float f1;
> +	float f2;
> +	float f3;
> +};
> +
> +/* Just return the value itself. */
> +struct sz12_t retsz12(struct sz12_t a);
> +
> +/* Return sum of two. */
> +struct sz12_t add2sz12(struct sz12_t a, struct sz12_t b);
> +
> +/* Return sum of three. */
> +struct sz12_t add3sz12(struct sz12_t a, struct sz12_t b, struct sz12_t c);
> +]]
> +
> +-- For pretty error messages.
> +local sz12_mt = {__tostring = function(sz12)
> +  return string.format('{%d,%d,%d}', sz12.f1, sz12.f2, sz12.f3)
> +end}
> +
> +local sz12_t = ffi.metatype('struct sz12_t', sz12_mt)
> +
> +local function new_sz12(f1, f2, f3)
> +  return sz12_t(f1, f2, f3)
> +end
> +
> +-- For pretty error messages.
> +local expected_mt = {__tostring = function(t)
> +  return '{'..table.concat(t, ',')..'}'
> +end}
> +
> +local function test_sz12(got, expected)
> +  assert(type(got) == 'cdata')
> +  assert(type(expected) == 'table')
> +  expected = setmetatable(expected, expected_mt)
> +  for i = 1, #expected do
> +    if got['f'..i] ~= expected[i] then
> +      error(('bad sz12_t value: got %s, expected %s'):format(
> +        tostring(got), tostring(expected)
> +      ))
> +    end
> +  end
> +  return true
> +end
> +
> +-- Test arm64 calling convention for HFA structures.
> +-- Need structure which size is not multiple by 8.
> +local sz12_111 = new_sz12(1, 1, 1)
> +test:ok(test_sz12(sz12_111, {1, 1, 1}), 'base')
> +local sz12_222 = ffi_ccall.add2sz12(sz12_111, sz12_111)
> +test:ok(test_sz12(sz12_222, {2, 2, 2}), '2 structures as args')
> +local sz12_333 = ffi_ccall.add3sz12(sz12_111, sz12_111, sz12_111)
> +test:ok(test_sz12(sz12_333, {3, 3, 3}), '3 structures as args')
> +
> +os.exit(test:check() and 0 or 1)
> diff --git a/test/tarantool-tests/ffi-ccall/CMakeLists.txt b/test/tarantool-tests/ffi-ccall/CMakeLists.txt
> new file mode 100644
> index 00000000..df937bf8
> --- /dev/null
> +++ b/test/tarantool-tests/ffi-ccall/CMakeLists.txt
> @@ -0,0 +1 @@
> +BuildTestCLib(libfficcall libfficcall.c)
> diff --git a/test/tarantool-tests/ffi-ccall/libfficcall.c b/test/tarantool-tests/ffi-ccall/libfficcall.c
> new file mode 100644
> index 00000000..0c043a68
> --- /dev/null
> +++ b/test/tarantool-tests/ffi-ccall/libfficcall.c
> @@ -0,0 +1,28 @@
> +struct sz12_t {
> +	float f1;
> +	float f2;
> +	float f3;
> +};
> +
> +struct sz12_t retsz12(struct sz12_t a)
> +{
> +	return a;
> +}
> +
> +struct sz12_t add2sz12(struct sz12_t a, struct sz12_t b)
> +{
> +	struct sz12_t res = {0};
> +	res.f1 = a.f1 + b.f1;
> +	res.f2 = a.f2 + b.f2;
> +	res.f3 = a.f3 + b.f3;
> +	return res;
> +}
> +
> +struct sz12_t add3sz12(struct sz12_t a, struct sz12_t b, struct sz12_t c)
> +{
> +	struct sz12_t res = {0};
> +	res.f1 = a.f1 + b.f1 + c.f1;
> +	res.f2 = a.f2 + b.f2 + c.f2;
> +	res.f3 = a.f3 + b.f3 + c.f3;
> +	return res;
> +}
> -- 
> 2.33.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
  2022-06-02  8:51   ` sergos via Tarantool-patches
@ 2022-06-28 19:05     ` Sergey Kaplun via Tarantool-patches
  2022-06-28 19:47       ` Sergey Ostanevich via Tarantool-patches
  2022-06-29  8:45       ` Igor Munkin via Tarantool-patches
  0 siblings, 2 replies; 10+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-06-28 19:05 UTC (permalink / raw)
  To: sergos; +Cc: tarantool-patches

Hi, Sergos!

Thanks for the review!

I've updated commit message to the following:

===================================================================
FFI/ARM64: Fix pass-by-value struct calling conventions.

(cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)

If the argument type is a Composite Type then the size of the
argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
backend makes this rounding unconditionally for arm64 architecture and
uses the result value to determine the necessary amount of registers
for the call.

The arm64 parameters passing rules for Homogeneous Floating-point
Aggregates (HFA) are the following [1]:
| If the argument is an HFA and there are sufficient unallocated
| Floating-point registers, then the argument is allocated to
| Floating-point registers (with one register per member of the HFA).

So for the HFA composed of 3 32-bit float members the extra one (4th)
register is supposed as occupied.

When the second parameter passed to the function these fields are loaded
starting from 5th register. As far as the procedure to call uses 6
registers according to the ABI it leads to the wrong value of the second
parameter passed to the callee (0 due to the corresponding memset in
`cconv_struct_init()`).

This patch fixes this case by using the real size of structure in the
calculation of necessary amount of registers to use.

Sergey Kaplun:
* added the description and the test for the problem

[1]: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing-rules

Part of tarantool/tarantool#6548
===================================================================

On 02.06.22, sergos wrote:
> Hi!
> 
> Thanks for the patch.
> Some minor updates to the wording, test updates requested.
> 
> Regards,
> Sergos
> 
> > On 9 Dec 2021, at 13:24, Sergey Kaplun <skaplun@tarantool.org> wrote:
> > 
> > From: Mike Pall <mike>
> > 
> > (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
> > 
> > The arm64 parameters passing rules for Homogeneous Floating-point
> > Aggregates (HFA) are the following [1]:
> > * If the argument type is an HFA or an HVA, then the argument is used
> 
> There’s no explanation of HVA here, unlike HFA. Should it help?
> 
> >  unmodified.
> > * If the argument is an HFA and there are sufficient unallocated
> >  Floating-point registers, then the argument is allocated to
> >  Floating-pointRegisters (with one register per member of the HFA).
> > 
> > Also, if the argument type is a Composite Type then the size of the
> > argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
> > backend makes this rounding unconditionally for arm64 architecture and
> > uses the result value to determine the necessary amount of registers
> > for the call. So for the HFA composed of 2n + 1 (< 4) float members the
> 
> I suppose you mean 32bit FP value here. 
> 
> > extra one register is supposed as occupied. This leads to the wrong
> > value (0 due to the corresponding memset in `cconv_struct_init()`) in
> > this register if the other one parameter is passed to the procedure.
> 
> If I read it correct: “the HFA compliance zero-filled padding  
> can be referred to as a next argument passed”?
> 
> > 
> > This patch fixes this case by using the real size of structure in the
> > calculation of necessary amount of registers to use.
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > [1]: https://developer.arm.com/documentation/ihi0055/b/
> 
> The link is 404. ARM is known for shuffling the docs, dunno how to pin it.
> 
> > 
> > Part of tarantool/tarantool#6548
> > ---
> > OK, there is an important note before you proceed with the patch:
> > 
> > `isfp` variable in arm64 may takes the following values:
> > 0 - for the pointer argument
> > 1 - each part of the argument (i.e. field in a structure or floating
> >    point number itself) takes exactly one register
> > 2 - the structure is HFA (including complex floats) and compact, i.e.
> >    two fields may be saved into one register
> > 
> > Patch fixes the behaviour in the last case, when the structure is HFA
> > and takes 8*n + 4 bytes. The variable can't take another value exept
> > mentioned before, IINM (I've checked it several times). So the magic
> 
> I don’t see any tests using half precision FP, say _Float16. It is available
> since GCC 7 at least. It can help with more variations in the size of the
> HFA - say 2 or 6.

Yes, LuaJIT FFI interface doesn't know about them :). This kind of
HFA structures is NIY in LuaJIT.

>  
> > `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
> > idea why Mike's prefered such interesting way to fix the behaviour in
> > this case... May be he has its own branch with different `isfp` values
> > and this is necessary for compatibility.
> > 
> > Side note: I choose a general name (without mentioning arm64) for this C
> > "library" in purpose. It may be adjusted in the future for brute force
> > testing of FFI calling conventions for different architectures.
> > See also: https://github.com/facebookarchive/luaffifb/blob/master/test.lua
> > This link has an interesting example of FFI tests. May be we can create
> > something similar with autogeneration of functions, structures and
> > tests (not right now, but later).
> > 
> > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
> > Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
> > 
> > src/lj_ccall.c                                |  3 +-
> > test/tarantool-tests/CMakeLists.txt           |  1 +
> > .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
> > test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
> > test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
> > 5 files changed, 97 insertions(+), 1 deletion(-)
> > create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
> > create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
> > create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
> > 

<snipped>

> > -- 
> > 2.33.1
> > 
> 

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
  2022-06-28 19:05     ` Sergey Kaplun via Tarantool-patches
@ 2022-06-28 19:47       ` Sergey Ostanevich via Tarantool-patches
  2022-06-29  8:45       ` Igor Munkin via Tarantool-patches
  1 sibling, 0 replies; 10+ messages in thread
From: Sergey Ostanevich via Tarantool-patches @ 2022-06-28 19:47 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 6551 bytes --]


Thanks! 

LGTM.

Best regards,
Sergos 

Tuesday, 28 June 2022, 22:07 +0300 from Kaplun Sergey  <skaplun@tarantool.org>:
>Hi, Sergos!
>
>Thanks for the review!
>
>I've updated commit message to the following:
>
>===================================================================
>FFI/ARM64: Fix pass-by-value struct calling conventions.
>
>(cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>
>If the argument type is a Composite Type then the size of the
>argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>backend makes this rounding unconditionally for arm64 architecture and
>uses the result value to determine the necessary amount of registers
>for the call.
>
>The arm64 parameters passing rules for Homogeneous Floating-point
>Aggregates (HFA) are the following [1]:
>| If the argument is an HFA and there are sufficient unallocated
>| Floating-point registers, then the argument is allocated to
>| Floating-point registers (with one register per member of the HFA).
>
>So for the HFA composed of 3 32-bit float members the extra one (4th)
>register is supposed as occupied.
>
>When the second parameter passed to the function these fields are loaded
>starting from 5th register. As far as the procedure to call uses 6
>registers according to the ABI it leads to the wrong value of the second
>parameter passed to the callee (0 due to the corresponding memset in
>`cconv_struct_init()`).
>
>This patch fixes this case by using the real size of structure in the
>calculation of necessary amount of registers to use.
>
>Sergey Kaplun:
>* added the description and the test for the problem
>
>[1]:  https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing-rules
>
>Part of tarantool/tarantool#6548
>===================================================================
>
>On 02.06.22, sergos wrote:
>> Hi!
>> 
>> Thanks for the patch.
>> Some minor updates to the wording, test updates requested.
>> 
>> Regards,
>> Sergos
>> 
>> > On 9 Dec 2021, at 13:24, Sergey Kaplun < skaplun@tarantool.org > wrote:
>> > 
>> > From: Mike Pall <mike>
>> > 
>> > (cherry picked from commit 9143e86498436892cb4316550be4d45b68a61224)
>> > 
>> > The arm64 parameters passing rules for Homogeneous Floating-point
>> > Aggregates (HFA) are the following [1]:
>> > * If the argument type is an HFA or an HVA, then the argument is used
>> 
>> There’s no explanation of HVA here, unlike HFA. Should it help?
>> 
>> >  unmodified.
>> > * If the argument is an HFA and there are sufficient unallocated
>> >  Floating-point registers, then the argument is allocated to
>> >  Floating-pointRegisters (with one register per member of the HFA).
>> > 
>> > Also, if the argument type is a Composite Type then the size of the
>> > argument is rounded up to the nearest multiple of 8 bytes. LuaJIT FFI
>> > backend makes this rounding unconditionally for arm64 architecture and
>> > uses the result value to determine the necessary amount of registers
>> > for the call. So for the HFA composed of 2n + 1 (< 4) float members the
>> 
>> I suppose you mean 32bit FP value here. 
>> 
>> > extra one register is supposed as occupied. This leads to the wrong
>> > value (0 due to the corresponding memset in `cconv_struct_init()`) in
>> > this register if the other one parameter is passed to the procedure.
>> 
>> If I read it correct: “the HFA compliance zero-filled padding 
>> can be referred to as a next argument passed”?
>> 
>> > 
>> > This patch fixes this case by using the real size of structure in the
>> > calculation of necessary amount of registers to use.
>> > 
>> > Sergey Kaplun:
>> > * added the description and the test for the problem
>> > 
>> > [1]:  https://developer.arm.com/documentation/ihi0055/b/
>> 
>> The link is 404. ARM is known for shuffling the docs, dunno how to pin it.
>> 
>> > 
>> > Part of tarantool/tarantool#6548
>> > ---
>> > OK, there is an important note before you proceed with the patch:
>> > 
>> > `isfp` variable in arm64 may takes the following values:
>> > 0 - for the pointer argument
>> > 1 - each part of the argument (i.e. field in a structure or floating
>> >    point number itself) takes exactly one register
>> > 2 - the structure is HFA (including complex floats) and compact, i.e.
>> >    two fields may be saved into one register
>> > 
>> > Patch fixes the behaviour in the last case, when the structure is HFA
>> > and takes 8*n + 4 bytes. The variable can't take another value exept
>> > mentioned before, IINM (I've checked it several times). So the magic
>> 
>> I don’t see any tests using half precision FP, say _Float16. It is available
>> since GCC 7 at least. It can help with more variations in the size of the
>> HFA - say 2 or 6.
>
>Yes, LuaJIT FFI interface doesn't know about them :). This kind of
>HFA structures is NIY in LuaJIT.
>
>> 
>> > `d->size >> (4-isfp)` is exactly `d->size/sizeof(float)`. I have no
>> > idea why Mike's prefered such interesting way to fix the behaviour in
>> > this case... May be he has its own branch with different `isfp` values
>> > and this is necessary for compatibility.
>> > 
>> > Side note: I choose a general name (without mentioning arm64) for this C
>> > "library" in purpose. It may be adjusted in the future for brute force
>> > testing of FFI calling conventions for different architectures.
>> > See also:  https://github.com/facebookarchive/luaffifb/blob/master/test.lua
>> > This link has an interesting example of FFI tests. May be we can create
>> > something similar with autogeneration of functions, structures and
>> > tests (not right now, but later).
>> > 
>> > Branch:  https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > Tarantool branch:  https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
>> > 
>> > src/lj_ccall.c                                |  3 +-
>> > test/tarantool-tests/CMakeLists.txt           |  1 +
>> > .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
>> > test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
>> > test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
>> > 5 files changed, 97 insertions(+), 1 deletion(-)
>> > create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
>> > create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
>> > create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
>> > 
>
><snipped>
>
>> > -- 
>> > 2.33.1
>> > 
>> 
>
>-- 
>Best regards,
>Sergey Kaplun

[-- Attachment #2: Type: text/html, Size: 10641 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
  2022-06-02  8:51   ` sergos via Tarantool-patches
@ 2022-06-29  8:31   ` Igor Munkin via Tarantool-patches
  1 sibling, 0 replies; 10+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2022-06-29  8:31 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! Glad to see ugly hacks are moving out from the
repo! LGTM.

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions.
  2022-06-28 19:05     ` Sergey Kaplun via Tarantool-patches
  2022-06-28 19:47       ` Sergey Ostanevich via Tarantool-patches
@ 2022-06-29  8:45       ` Igor Munkin via Tarantool-patches
  1 sibling, 0 replies; 10+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2022-06-29  8:45 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

Thanks for the patch! LGTM after all fixes.

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA
  2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
  2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches
@ 2022-06-30 12:11 ` Igor Munkin via Tarantool-patches
  2 siblings, 0 replies; 10+ messages in thread
From: Igor Munkin via Tarantool-patches @ 2022-06-30 12:11 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey,

I've checked the patches into all long-term branches in tarantool/luajit
and bumped a new version in master, 2.10 and 1.10 (only the first one).

On 09.12.21, Sergey Kaplun wrote:
> The first patch is necessary to be able to run tests using `ffi.load`
> without selfrun. We need just to use the set environment variable
> itself. And the second patch is an actual fix.
> 
> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
> Tarantool branch: https://github.com/tarantool/luajit/tree/tarantool/gh-noticket-arm64-ffi-ccall-fp-convention-full-ci
> 
> Mike Pall (1):
>   FFI/ARM64: Fix pass-by-value struct calling conventions.
> 
> Sergey Kaplun (1):
>   test: set DYLD_LIBRARY_PATH environment variable
> 
>  src/lj_ccall.c                                |  3 +-
>  test/tarantool-tests/CMakeLists.txt           | 20 +++---
>  .../arm64-ccall-fp-convention.test.lua        | 65 +++++++++++++++++++
>  test/tarantool-tests/ffi-ccall/CMakeLists.txt |  1 +
>  test/tarantool-tests/ffi-ccall/libfficcall.c  | 28 ++++++++
>  .../gh-4427-ffi-sandwich.test.lua             |  4 --
>  .../lj-flush-on-trace.test.lua                |  4 --
>  test/tarantool-tests/utils.lua                | 17 -----
>  8 files changed, 108 insertions(+), 34 deletions(-)
>  create mode 100644 test/tarantool-tests/arm64-ccall-fp-convention.test.lua
>  create mode 100644 test/tarantool-tests/ffi-ccall/CMakeLists.txt
>  create mode 100644 test/tarantool-tests/ffi-ccall/libfficcall.c
> 
> -- 
> 2.33.1
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-06-30 12:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-09 10:24 [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Sergey Kaplun via Tarantool-patches
2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 1/2] test: set DYLD_LIBRARY_PATH environment variable Sergey Kaplun via Tarantool-patches
2022-06-02  8:51   ` sergos via Tarantool-patches
2022-06-29  8:31   ` Igor Munkin via Tarantool-patches
2021-12-09 10:24 ` [Tarantool-patches] [PATCH luajit 2/2] FFI/ARM64: Fix pass-by-value struct calling conventions Sergey Kaplun via Tarantool-patches
2022-06-02  8:51   ` sergos via Tarantool-patches
2022-06-28 19:05     ` Sergey Kaplun via Tarantool-patches
2022-06-28 19:47       ` Sergey Ostanevich via Tarantool-patches
2022-06-29  8:45       ` Igor Munkin via Tarantool-patches
2022-06-30 12:11 ` [Tarantool-patches] [PATCH luajit 0/2] FFI: fix arm64 call for HFA Igor Munkin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox