From: Sergey Bronnikov via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Sergey Kaplun <skaplun@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix LDP/STP fusion (again). Date: Mon, 8 Sep 2025 13:40:48 +0300 [thread overview] Message-ID: <b4618a2d-1333-437f-83d0-b5001b5ded8e@tarantool.org> (raw) In-Reply-To: <aL6mWeT8VgXNsq75@root> [-- Attachment #1: Type: text/plain, Size: 4813 bytes --] LGTM On 9/8/25 12:48, Sergey Kaplun wrote: > Hi, Sergey! > Thanks for the review! > Fixed your comment and force-pushed the branch. > > On 08.09.25, Sergey Bronnikov wrote: >> Hi, Sergey, >> >> thanks for the patch! LGTM with two minor comments >> >> Sergey >> > <snipped> > >>> On 8/27/25 12:17, Sergey Kaplun wrote: >>>> From: Mike Pall <mike> >>>> >>>> Reported and analyzed by Zhongwei Yao. Fix by Peter Cawley. >>>> >>>> (cherry picked from commit b8c6ccd50c61b7a2df5123ddc5a85ac7d089542b) >>>> >>>> Assume we have stores/loads from the pointer with offset +488 and -16. >>>> The lower bits of the offset are the same as for the offset (488 + 8). >>>> This leads to the incorrect fusion of these instructions: >>>> | str x20, [x21, 488] >>>> | stur x20, [x21, -16] >>>> to the following instruction: >>>> | stp x20, x20, [x21, 488] >>>> >>>> This patch prevents this fusion by more accurate offset comparison. >>>> >>>> Sergey Kaplun: >>>> * added the description and the test for the problem >>>> >>>> Part of tarantool/tarantool#11691 >>>> --- >>>> >>>> Branch:https://github.com/tarantool/luajit/tree/skaplun/lj-1075-arm64-incorrect-ldp-stp-fusion >>>> Related issues: >>>> *https://github.com/tarantool/tarantool/issues/11691 >>>> *https://github.com/LuaJIT/LuaJIT/issues/1075 >>>> >>>> src/lj_emit_arm64.h | 17 ++- >>>> ...75-arm64-incorrect-ldp-stp-fusion.test.lua | 129 ++++++++++++++++++ >>>> 2 files changed, 142 insertions(+), 4 deletions(-) >>>> create mode 100644 test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua >>>> >>>> diff --git a/src/lj_emit_arm64.h b/src/lj_emit_arm64.h >>>> index 5c1bc372..9dd92c40 100644 >>>> --- a/src/lj_emit_arm64.h >>>> +++ b/src/lj_emit_arm64.h > <snipped> > >>>> diff --git a/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua >>>> new file mode 100644 >>>> index 00000000..c84c3b23 >>>> --- /dev/null >>>> +++ b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua >>>> @@ -0,0 +1,129 @@ > <snipped> > >>>> + >>>> +jit.opt.start('hotloop=2') >> Why 2? It deserves a comment, because usually we use 1 hotloop. > It's a copy-pasting mistake from the aarch64 machine, fixed to > `hotloop=1`, thanks: Thanks! > > =================================================================== > diff --git a/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua > index c84c3b23..393a1aa7 100644 > --- a/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua > +++ b/test/tarantool-tests/lj-1075-arm64-incorrect-ldp-stp-fusion.test.lua > @@ -40,7 +40,7 @@ local function init_buf() > end > end > > -jit.opt.start('hotloop=2') > +jit.opt.start('hotloop=1') > > -- Assume we have stores/loads from the pointer with offset > -- +488 and -16. The lower 7 bits of the offset (-16) >> 2 are > =================================================================== > > <snipped> > >>>> + >>>> +-- Another reproducer that is based on the snapshot restoring. >>>> +-- Its advantage is avoiding FFI usage. >>>> + >>>> +-- Snapshot slots are restored in the reversed order. >>>> +-- The recording order is the following (from the bottom of the >>>> +-- trace to the top): >>>> +-- - 0th (ofs == -16) -- `f64()` replaced the `tail64()` on the >>>> +-- stack, >>>> +-- - 63rd (ofs == 488) -- 1, >>>> +-- - 64th (ofs == 496) -- 2. >>>> +-- At recording, the instructions for the 0th and 63rd slots are >>>> +-- merged like the following: >>>> +-- | str x3, [x19, #496] >>>> +-- | stp x2, x1, [x19, #488] >>>> +-- The first store is dominated by the stp, so the restored value >>>> +-- is incorrect. >>>> + >>>> +-- Function with 63 slots on the stack. >>>> +local function f63() >> Minor: Hardcode a number of slots to the function name looks odd. > It is mentioned above why exactly this amount of slots is required. > It shouldn't be touched. The question was about hard-coding a number in a function name, not about using exactly this number of slots. Ok, I'll not insist, as I said in a question. >> The same for tail63. Bumping a number of slots will >> >> require renaming of two functions. >> >> Feel free to ignore. > Ignoring. > >>>> + -- 61 unused slots to avoid extra stores in between. >>>> + -- luacheck: no unused >>>> + local _, _, _, _, _, _, _, _, _, _ >>>> + local _, _, _, _, _, _, _, _, _, _ >>>> + local _, _, _, _, _, _, _, _, _, _ >>>> + local _, _, _, _, _, _, _, _, _, _ >>>> + local _, _, _, _, _, _, _, _, _, _ >>>> + local _, _, _, _, _, _, _, _, _, _ >>>> + local _ >>>> + return 1, 2 >>>> +end >>>> + > <snipped> > >>>> +test:done(true) [-- Attachment #2: Type: text/html, Size: 7680 bytes --]
prev parent reply other threads:[~2025-09-08 10:40 UTC|newest] Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top 2025-08-27 9:17 Sergey Kaplun via Tarantool-patches 2025-09-08 8:54 ` Sergey Bronnikov via Tarantool-patches 2025-09-08 9:18 ` Sergey Kaplun via Tarantool-patches 2025-09-08 9:26 ` Sergey Bronnikov via Tarantool-patches 2025-09-08 9:48 ` Sergey Kaplun via Tarantool-patches 2025-09-08 10:40 ` Sergey Bronnikov via Tarantool-patches [this message]
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=b4618a2d-1333-437f-83d0-b5001b5ded8e@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=sergeyb@tarantool.org \ --cc=skaplun@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH luajit] ARM64: Fix LDP/STP fusion (again).' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox