From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Sergey Bronnikov <sergeyb@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix LDP/STP fusion (again). Date: Mon, 8 Sep 2025 12:48:09 +0300 [thread overview] Message-ID: <aL6mWeT8VgXNsq75@root> (raw) In-Reply-To: <fa085e96-8056-444f-9134-5287db011005@tarantool.org> Hi, Sergey! Thanks for the review! Fixed your comment and force-pushed the branch. On 08.09.25, Sergey Bronnikov wrote: > Hi, Sergey, > > thanks for the patch! LGTM with two minor comments > > Sergey > <snipped> > > On 8/27/25 12:17, Sergey Kaplun wrote: > >> From: Mike Pall <mike> > >> > >> Reported and analyzed by Zhongwei Yao. Fix by Peter Cawley. > >> > >> (cherry picked from commit b8c6ccd50c61b7a2df5123ddc5a85ac7d089542b) > >> > >> Assume we have stores/loads from the pointer with offset +488 and -16. > >> The lower bits of the offset are the same as for the offset (488 + 8). > >> This leads to the incorrect fusion of these instructions: > >> | str x20, [x21, 488] > >> | stur x20, [x21, -16] > >> to the following instruction: > >> | stp x20, x20, [x21, 488] > >> > >> This patch prevents this fusion by more accurate offset comparison. > >> > >> Sergey Kaplun: > >> * added the description and the test for the problem > >> > >> Part of tarantool/tarantool#11691 > >> --- > >> > >> Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1075-arm64-incorrect-ldp-stp-fusion > >> Related issues: > >> *https://github.com/tarantool/tarantool/issues/11691 > >> *https://github.com/LuaJIT/LuaJIT/issues/1075 > >> > >> src/lj_emit_arm64.h | 17 ++- > >> ...75-arm64-incorrect-ldp-stp-fusion.test.lua | 129 ++++++++++++++++++ > >> 2 files changed, 142 insertions(+), 4 deletions(-) > >> create mode 100644 test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua > >> > >> diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h > >> index 5c1bc372..9dd92c40 100644 > >> --- a/src/lj_emit_arm64.h > >> +++ b/src/lj_emit_arm64.h <snipped> > >> diff --git a/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua > >> new file mode 100644 > >> index 00000000..c84c3b23 > >> --- /dev/null > >> +++ b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua > >> @@ -0,0 +1,129 @@ <snipped> > >> + > >> +jit.opt.start('hotloop=2') > > Why 2? It deserves a comment, because usually we use 1 hotloop. It's a copy-pasting mistake from the aarch64 machine, fixed to `hotloop=1`, thanks: =================================================================== diff --git a/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua index c84c3b23..393a1aa7 100644 --- a/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua +++ b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua @@ -40,7 +40,7 @@ local function init_buf() end end -jit.opt.start('hotloop=2') +jit.opt.start('hotloop=1') -- Assume we have stores/loads from the pointer with offset -- +488 and -16. The lower 7 bits of the offset (-16) >> 2 are =================================================================== > <snipped> > >> + > >> +-- Another reproducer that is based on the snapshot restoring. > >> +-- Its advantage is avoiding FFI usage. > >> + > >> +-- Snapshot slots are restored in the reversed order. > >> +-- The recording order is the following (from the bottom of the > >> +-- trace to the top): > >> +-- - 0th (ofs == -16) -- `f64()` replaced the `tail64()` on the > >> +-- stack, > >> +-- - 63rd (ofs == 488) -- 1, > >> +-- - 64th (ofs == 496) -- 2. > >> +-- At recording, the instructions for the 0th and 63rd slots are > >> +-- merged like the following: > >> +-- | str x3, [x19, #496] > >> +-- | stp x2, x1, [x19, #488] > >> +-- The first store is dominated by the stp, so the restored value > >> +-- is incorrect. > >> + > >> +-- Function with 63 slots on the stack. > >> +local function f63() > > Minor: Hardcode a number of slots to the function name looks odd. It is mentioned above why exactly this amount of slots is required. It shouldn't be touched. > > The same for tail63. Bumping a number of slots will > > require renaming of two functions. > > Feel free to ignore. Ignoring. > > >> + -- 61 unused slots to avoid extra stores in between. > >> + -- luacheck: no unused > >> + local _, _, _, _, _, _, _, _, _, _ > >> + local _, _, _, _, _, _, _, _, _, _ > >> + local _, _, _, _, _, _, _, _, _, _ > >> + local _, _, _, _, _, _, _, _, _, _ > >> + local _, _, _, _, _, _, _, _, _, _ > >> + local _, _, _, _, _, _, _, _, _, _ > >> + local _ > >> + return 1, 2 > >> +end > >> + <snipped> > >> +test:done(true) -- Best regards, Sergey Kaplun
next prev parent reply other threads:[~2025-09-08 9:47 UTC|newest] Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top 2025-08-27 9:17 Sergey Kaplun via Tarantool-patches 2025-09-08 8:54 ` Sergey Bronnikov via Tarantool-patches 2025-09-08 9:18 ` Sergey Kaplun via Tarantool-patches 2025-09-08 9:26 ` Sergey Bronnikov via Tarantool-patches 2025-09-08 9:48 ` Sergey Kaplun via Tarantool-patches [this message] 2025-09-08 10:40 ` Sergey Bronnikov via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=aL6mWeT8VgXNsq75@root \ --to=tarantool-patches@dev.tarantool.org \ --cc=sergeyb@tarantool.org \ --cc=skaplun@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix LDP/STP fusion (again).' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox