From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
To: Sergey Kaplun
Cc: tarantool-patches
Date: Thu, 16 Jan 2025 15:47:37 +0300
Subject: Re: [Tarantool-patches] [PATCH luajit 1/2] Cleanup CPU detection and tuning for old CPUs.
List-Id: Tarantool development patches
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------c1HLJ0Jn0fLMeQZwQjXLaWBB"

--------------c1HLJ0Jn0fLMeQZwQjXLaWBB
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit

Hi, Sergey,

Thanks for the fixes! LGTM

On 15.01.2025 16:10, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the review!
> Updated the commit message and force-pushed the branch.
>
> On 14.01.25, Sergey Bronnikov wrote:
>> Hi, Sergey,
>>
>> thanks for the patch!
>>
>> LGTM with a minor comment
>>
>> Sergey
>>
>> On 13.01.2025 18:17, Sergey Kaplun wrote:
>>> From: Mike Pall <mike>
>>>
>>> (cherry picked from commit 0eddcbead2d67c16dcd4039a6765b9d2fc8ea631)
>>>
>>> This patch does the following refactoring:
>>> 1) Drops optimizations for the Intel Atom CPU [1]: removes the
>>>    `JIT_F_LEA_AGU` flag and related optimizations. The considerations
>>>    for the use of LEA are complex and very CPU-specific, mostly
>>>    dependent on the number of operands. Mostly, it isn't worth it due to
>>>    the extra register pressure and/or extra instructions.
>> I would state explicitly where `JIT_F_LEA_AGU` is relevant, as Mike
>> explained in LUAJIT#24: "Well, yes, that applies to the original and
>> obsolete Atom architecture. Today "Intel Atom" is just a trade name
>> for reduced-performance implementations of the current Intel
>> architecture."
>>
>> So there is no risk of performance degradation for Tarantool users.
> Added, as you suggested. The new commit message is as follows:
>
> | Cleanup CPU detection and tuning for old CPUs.
> |
> | (cherry picked from commit 0eddcbead2d67c16dcd4039a6765b9d2fc8ea631)
> |
> | This patch does the following refactoring:
> | 1) Drops optimizations for the Intel Atom CPU [1]: removes the
> |    `JIT_F_LEA_AGU` flag and related optimizations. The considerations
> |    for the use of LEA are complex and very CPU-specific, mostly
> |    dependent on the number of operands. Mostly, it isn't worth it due to
> |    the extra register pressure and/or extra instructions.
> |    Note that this applies to the original and obsolete Atom
> |    architecture. Today "Intel Atom" is just a trade name for
> |    reduced-performance implementations of the current Intel
> |    architecture.
> | 2) Drops optimizations for the AMD K8, K10 CPU [2][3]: removes the
> |    `JIT_F_PREFER_IMUL` flag and related optimizations.
> | 3) Refactors JIT flags defined in <lj_jit.h>. Now all CPU-specific
> |    JIT flags are defined as the left shift of `JIT_F_CPU` instead of
> |    hardcoded constants, and likewise for the optimization flags.
> | 4) Adds detection of the ARM8 CPU.
> | 5) Drops the check for SSE2 since the VM already presumes the CPU
> |    supports it.
> | 6) Adds checks for the `__ARM_ARCH` [4] macro in <lj_arch.h>.
> | 7) Drops the outdated comment in the amalgamation file about memory
> |    requirements.
> |
> | Sergey Kaplun:
> | * added the description for the patch
> |
> | [1]: https://en.wikipedia.org/wiki/Intel_Atom
> | [2]: https://en.wikipedia.org/wiki/AMD_K8
> | [3]: https://en.wikipedia.org/wiki/AMD_K10
> | [4]: https://developer.arm.com/documentation/dui0774/l/Other-Compiler-specific-Features/Predefined-macros
> |
> | Part of tarantool/tarantool#10709

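For illustration, item 3 of the commit message (flags defined as a left shift of a base constant rather than as hardcoded values) can be sketched roughly as below. This is only a minimal sketch: every DEMO_* name, the base values, and the group width are hypothetical and are not copied from <lj_jit.h>.

```c
#include <assert.h>
#include <stdint.h>

/* All DEMO_* names and values are hypothetical, chosen for illustration;
** they are not copied from <lj_jit.h>. */

/* Base bit of the CPU-specific group; each flag is a shift of it. */
#define DEMO_F_CPU        0x00000010u
#define DEMO_F_CPU_FEAT_A (DEMO_F_CPU << 0)
#define DEMO_F_CPU_FEAT_B (DEMO_F_CPU << 1)
#define DEMO_F_CPU_FEAT_C (DEMO_F_CPU << 2)

/* Base bit of the optimization group, defined the same way. */
#define DEMO_F_OPT        0x00010000u
#define DEMO_F_OPT_PASS_A (DEMO_F_OPT << 0)
#define DEMO_F_OPT_PASS_B (DEMO_F_OPT << 1)

/* Width of the CPU group: the bits between the two bases (4..15). */
#define DEMO_F_CPU_BITS   12u

/* Return the n-th CPU flag; n must stay inside the CPU group. */
static inline uint32_t demo_cpu_flag(unsigned n)
{
  assert(n < DEMO_F_CPU_BITS);
  return DEMO_F_CPU << n;
}

/* Return 1 if every bit of `flag` is set in `flags`, else 0. */
static inline int demo_has_flag(uint32_t flags, uint32_t flag)
{
  return (flags & flag) == flag;
}
```

With this layout, moving a whole group only requires changing its base constant, and removing a flag (as the patch does for `JIT_F_LEA_AGU` and `JIT_F_PREFER_IMUL`) only requires adjusting the shift counts of the remaining flags in that group.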

        
>>> 2) Drops optimizations for the AMD K8, K10 CPU [2][3]: removes the
>>>    `JIT_F_PREFER_IMUL` flag and related optimizations.
>>> 3) Refactors JIT flags defined in the <lj_jit.h>. Now all CPU-specific
>>>    JIT flags are defined as the left shift of `JIT_F_CPU` instead of
>>>    hardcoded constants, similar for the optimization flags.
>>> 4) Adds detection of the ARM8 CPU.
>>> 5) Drops the check for SSE2 since the VM already presumes CPU supports
>>>    it.
>>> 6) Adds checks for `__ARM_ARCH`[4] macro in <lj_arch.h>.
>>> 7) Drops outdated comment in the amalgamation file about memory
>>>    requirements.
>>>
>>> Sergey Kaplun:
>>> * added the description for the patch
>>>
>>> [1]:https://en.wikipedia.org/wiki/Intel_Atom
>>> [2]:https://en.wikipedia.org/wiki/AMD_K8
>>> [3]:https://en.wikipedia.org/wiki/AMD_K10
>>> [4]:https://developer.arm.com/documentation/dui0774/l/Other-Compiler-specific-Features/Predefined-macros
>>>
>>> Part of tarantool/tarantool#10709
>>> ---
> <snipped>

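As a side note on item 6 of the commit message, a check on the `__ARM_ARCH` macro might look roughly like the sketch below. It assumes a compiler that predefines `__ARM_ARCH` on ARM targets and falls back to the older per-architecture macros otherwise. DEMO_ARM_VERSION and demo_is_armv8_or_newer() are hypothetical names for illustration; the actual checks in <lj_arch.h> may be structured differently.

```c
#include <assert.h>

/* DEMO_ARM_VERSION is a hypothetical name used for illustration only;
** it is not a macro from <lj_arch.h>. */
#if defined(__ARM_ARCH)
/* Compilers that predefine __ARM_ARCH give the architecture number. */
#define DEMO_ARM_VERSION __ARM_ARCH
#elif defined(__ARM_ARCH_8__) || defined(__ARM_ARCH_8A__)
/* Fallback: older per-architecture macros. */
#define DEMO_ARM_VERSION 8
#elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__)
#define DEMO_ARM_VERSION 7
#else
/* Not an ARM target, or the version could not be determined. */
#define DEMO_ARM_VERSION 0
#endif

#if defined(__aarch64__)
/* Compile-time sanity check: a 64-bit ARM target is ARMv8 or newer. */
static_assert(DEMO_ARM_VERSION >= 8, "AArch64 implies ARMv8 or newer");
#endif

/* Return 1 when compiled for ARMv8 or newer, else 0. */
static inline int demo_is_armv8_or_newer(void)
{
  return DEMO_ARM_VERSION >= 8;
}
```

On a non-ARM build the sketch evaluates to version 0, so demo_is_armv8_or_newer() returns 0 there.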
--------------c1HLJ0Jn0fLMeQZwQjXLaWBB--