From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp32.i.mail.ru (smtp32.i.mail.ru [94.100.177.92])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 5E03F469710
	for ; Tue, 9 Jun 2020 01:33:29 +0300 (MSK)
From: Vladislav Shpilevoy
References: <048601d6352c$3cdb81a0$b69284e0$@tarantool.org>
	<05c14f7d-6f18-05ba-19ad-b5aa4dc91d17@tarantool.org>
Message-ID:
Date: Tue, 9 Jun 2020 00:33:26 +0200
MIME-Version: 1.0
In-Reply-To: <05c14f7d-6f18-05ba-19ad-b5aa4dc91d17@tarantool.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH v2 04/10] crc32: align memory access
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Timur Safin , tarantool-patches@dev.tarantool.org,
	alyapunov@tarantool.org, korablev@tarantool.org

> diff --git a/src/cpu_feature.c b/src/cpu_feature.c
> index 9bf6223de..7b284fa98 100644
> --- a/src/cpu_feature.c
> +++ b/src/cpu_feature.c
> @@ -69,10 +69,8 @@ crc32c_hw_byte(uint32_t crc, char const *data, unsigned int length)
>  uint32_t
>  crc32c_hw(uint32_t crc, const char *buf, unsigned int len)
>  {
> -	const int align = alignof(unsigned long);
> -	unsigned long addr = (unsigned long)buf;
> -	unsigned int not_aligned_prefix =
> -		((addr - 1 + align) & ~(align - 1)) - addr;
> +	const unsigned int align = alignof(unsigned long);
> +	unsigned int not_aligned_prefix = align - (unsigned long)buf % align;

When the address is already aligned, not_aligned_prefix becomes equal to
align instead of 0. For an 8-byte word it means we will do 8 byte-by-byte
operations instead of 1.

I fixed it this way:

====================
diff --git a/src/cpu_feature.c b/src/cpu_feature.c
index 7b284fa98..856f054c7 100644
--- a/src/cpu_feature.c
+++ b/src/cpu_feature.c
@@ -70,7 +70,8 @@ uint32_t
 crc32c_hw(uint32_t crc, const char *buf, unsigned int len)
 {
 	const unsigned int align = alignof(unsigned long);
-	unsigned int not_aligned_prefix = align - (unsigned long)buf % align;
+	unsigned int not_aligned_prefix =
+		(align - (unsigned long)buf % align) % align;
 	/*
 	 * Calculate CRC32 for the prefix byte-by-byte so as to
 	 * then use aligned words to calculate the rest. This is
====================

This is fast, because align is an unsigned compile-time constant and a
power of two, so % align is transformed into & (align - 1) in the
assembly.
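
To make the difference between the two expressions concrete, here is a
small standalone sketch. It is not part of the patch: the helper names
(prefix_v2, prefix_fixed, prefix_mask, check_all_offsets) are made up for
illustration, and the address is passed as an integer so no real buffer is
needed. Only the two arithmetic expressions are taken from the diffs above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Expression from the v2 patch (hypothetical wrapper). For an already
 * aligned address it yields align, not 0.
 */
static inline unsigned int
prefix_v2(uintptr_t addr, unsigned int align)
{
	return align - (unsigned int)(addr % align);
}

/*
 * Expression from the fix (hypothetical wrapper). The outer % align
 * maps the aligned case from align to 0.
 */
static inline unsigned int
prefix_fixed(uintptr_t addr, unsigned int align)
{
	return (align - (unsigned int)(addr % align)) % align;
}

/*
 * Same value computed with a mask. This is only valid when align is a
 * power of two, which is an assumption of this sketch.
 */
static inline unsigned int
prefix_mask(uintptr_t addr, unsigned int align)
{
	return (unsigned int)(-addr & (uintptr_t)(align - 1));
}

/* Walk two full words of offsets and check the fixed prefix length. */
static inline void
check_all_offsets(void)
{
	const unsigned int align = 8;
	for (uintptr_t addr = 0x1000; addr < 0x1000 + 2 * align; addr++) {
		unsigned int fixed = prefix_fixed(addr, align);
		/* The prefix never covers a whole word. */
		assert(fixed < align);
		/* Skipping the prefix lands on an aligned address. */
		assert((addr + fixed) % align == 0);
		/* The masked form agrees with the modulo form. */
		assert(fixed == prefix_mask(addr, align));
	}
}
```

With align = 8, prefix_v2(0x1000, 8) gives 8 while prefix_fixed(0x1000, 8)
gives 0; for an unaligned address such as 0x1003 both give 5.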