From: Vladislav Shpilevoy
To: Timur Safin, tarantool-patches@dev.tarantool.org, alyapunov@tarantool.org, korablev@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 04/10] crc32: align memory access
Date: Fri, 29 May 2020 01:23:29 +0200
Message-ID: <05c14f7d-6f18-05ba-19ad-b5aa4dc91d17@tarantool.org>
In-Reply-To: <048601d6352c$3cdb81a0$b69284e0$@tarantool.org>
List-Id: Tarantool development patches

Thanks for the review!

On 28/05/2020 22:11, Timur Safin wrote:
>
>
> : -----Original Message-----
> : From: Vladislav Shpilevoy
> : Subject: [PATCH v2 04/10] crc32: align memory access
> :
> :
> : diff --git a/src/cpu_feature.c b/src/cpu_feature.c
> : index 98567ccb3..9bf6223de 100644
> : --- a/src/cpu_feature.c
> : +++ b/src/cpu_feature.c
> : @@ -50,7 +51,7 @@
> :
> :
> :  static uint32_t
> : -crc32c_hw_byte(uint32_t crc, unsigned char const *data, unsigned int length)
> : +crc32c_hw_byte(uint32_t crc, char const *data, unsigned int length)
> :  {
> :  	while (length--) {
> :  		__asm__ __volatile__(
> : @@ -68,6 +69,26 @@ crc32c_hw_byte(uint32_t crc, unsigned char const *data, unsigned int length)
> :  uint32_t
> :  crc32c_hw(uint32_t crc, const char *buf, unsigned int len)
> :  {
> : +	const int align = alignof(unsigned long);
> : +	unsigned long addr = (unsigned long)buf;
> : +	unsigned int not_aligned_prefix =
> : +		((addr - 1 + align) & ~(align - 1)) - addr;
>
> Hmm, hmm...
>
> Isn't it simple `addr % align`? Or even `addr & (align - 1)` ?

Consider the example: addr = 14, align = 8. Then
not_aligned_prefix = 2: the first 2 bytes have to be read one by one
to get to 16, the closest aligned address.

addr % align = 14 % 8 = 6 != 2
addr & (align - 1) = 14 & 7 = 1110 & 0111 = 110 = 6 != 2

But yeah, this could be done simpler: align - addr % align. This
gives the number of bytes remaining until the next aligned address.
I wrote the solution above by blindly using 'aligned - not aligned'
and the same scheme as in small_align().

Here is the diff:

====================
diff --git a/src/cpu_feature.c b/src/cpu_feature.c
index 9bf6223de..7b284fa98 100644
--- a/src/cpu_feature.c
+++ b/src/cpu_feature.c
@@ -69,10 +69,8 @@ crc32c_hw_byte(uint32_t crc, char const *data, unsigned int length)
 uint32_t
 crc32c_hw(uint32_t crc, const char *buf, unsigned int len)
 {
-	const int align = alignof(unsigned long);
-	unsigned long addr = (unsigned long)buf;
-	unsigned int not_aligned_prefix =
-		((addr - 1 + align) & ~(align - 1)) - addr;
+	const unsigned int align = alignof(unsigned long);
+	unsigned int not_aligned_prefix = align - (unsigned long)buf % align;
 	/*
 	 * Calculate CRC32 for the prefix byte-by-byte so as to
 	 * then use aligned words to calculate the rest. This is