From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp16.mail.ru (smtp16.mail.ru [94.100.176.153])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 065D8469710
	for ; Tue, 19 May 2020 00:17:30 +0300 (MSK)
References: <172feb69ec2caea2aa63a24de8b76a4dba13bebb.1589583614.git.v.shpilevoy@tarantool.org>
 <52af1bb7-b125-7490-3882-ce74698789d2@tarantool.org>
From: Vladislav Shpilevoy
Date: Mon, 18 May 2020 23:17:28 +0200
MIME-Version: 1.0
In-Reply-To: <52af1bb7-b125-7490-3882-ce74698789d2@tarantool.org>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
List-Id: Tarantool development patches
To: Aleksandr Lyapunov, tarantool-patches@dev.tarantool.org,
 tsafin@tarantool.org, gorcunov@gmail.com

Thanks for the review!

> On 5/16/20 2:03 AM, Vladislav Shpilevoy wrote:
>>   tt_uuid_is_nil(const struct tt_uuid *uu)
>>   {
>> -    const uint64_t *p = (const uint64_t *) uu;
>> -    return !p[0] && !p[1];
>> +    const uint32_t *p = (const uint32_t *) uu;
>> +    return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 0;
>>   }
>>
>>   /**
>> @@ -172,9 +172,10 @@ tt_uuid_is_nil(const struct tt_uuid *uu)
>>   inline bool
>>   tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs)
>>   {
>> -    const uint64_t *lp = (const uint64_t *) lhs;
>> -    const uint64_t *rp = (const uint64_t *) rhs;
>> -    return lp[0] == rp[0] && lp[1] == rp[1];
>> +    const uint32_t *lp = (const uint32_t *) lhs;
>> +    const uint32_t *rp = (const uint32_t *) rhs;
>> +    return lp[0] == rp[0] && lp[1] == rp[1] && lp[2] == rp[2] &&
>> +           lp[3] == rp[3];
>
> It seems that we degrade performance just for clang to be happy..

Yeah, well.
This is the same as saying that we degrade performance when we do
OOM checks. Unaligned memory access is UB. This is a bug.

> I would suggest to use memcmp in this case.
> It's portable and allows a compiler to generate the best possible code.
> I've measured it (gcc) and memcmp version is twice faster than your solution.
> Even for _is_nil method it's better to use memcmp with statically allocated zero buffer.

Could you please show the benchmark? I did my own, and I can't see
any significant difference. The difference that does show up is so
small that it looks like jitter, both in is_nil and is_eq.

I did a very simple bench in Lua, without any GCed objects.

	uuid = require('uuid')
	uuid1 = uuid.new()
	uuid2 = uuid.new()
	clock = require('clock')

	function bench1()
		local res = 0
		for i = 1, 10000000 do
			res = res + ((uuid1 == uuid2) and 1 or 0)
		end
		return res
	end

	function bench2()
		local res = 0
		for i = 1, 100000000 do
			res = res + ((uuid1:isnil() and uuid2:isnil()) and 1 or 0)
		end
		return res
	end

	clock.bench(bench2)

Bench1 checks is_eq(). Bench2 checks is_nil(). The build was
Release. Here is what I've got:

uuid is eq:

	4 x uint32 (my solution):
		2.691714
		2.727981
		2.852932
		2.764013

	memcmp (your solution):
		2.701216
		2.731426
		2.641924
		2.628619

	2 x uint64 (old solution):
		2.671254
		2.644062
		2.639914
		2.66742

uuid is nil:

	4 x uint32:
		0.2599
		0.256139
		0.258356
		0.259495

	memcmp:
		0.26042
		0.263186
		0.262118
		0.261504

	2 x uint64:
		0.256665
		0.259839
		0.259746
		0.258651
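
For reference, here is a minimal sketch of the memcmp-based variant discussed above, next to a memcpy-based way to keep a two-word check without casting the pointer to uint64_t. Everything in it is a hypothetical stand-in and not the Tarantool code: `struct demo_uuid`, its field layout (assumed to be 16 bytes with 4-byte alignment), and the `demo_uuid_*` function names are assumptions made for illustration. Neither variant dereferences a pointer of a wider type than the struct guarantees, so neither relies on 8-byte alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a UUID struct. The layout is an assumption:
 * 16 bytes, no padding, aligned to 4 bytes (not 8).
 */
struct demo_uuid {
	uint32_t time_low;
	uint16_t time_mid;
	uint16_t time_hi_and_version;
	uint8_t clock_seq_hi_and_reserved;
	uint8_t clock_seq_low;
	uint8_t node[6];
};

_Static_assert(sizeof(struct demo_uuid) == 16, "sketch assumes 16 bytes");

/* memcmp variant: compares the raw bytes, no pointer cast involved. */
static inline bool
demo_uuid_is_equal(const struct demo_uuid *lhs, const struct demo_uuid *rhs)
{
	return memcmp(lhs, rhs, sizeof(*lhs)) == 0;
}

/* memcmp against a statically allocated, zero-initialized buffer. */
static inline bool
demo_uuid_is_nil(const struct demo_uuid *uu)
{
	static const struct demo_uuid nil = {0};
	return memcmp(uu, &nil, sizeof(nil)) == 0;
}

/*
 * Two-word variant: copy the bytes into properly aligned uint64_t
 * storage first, then test the words. The copy replaces the cast.
 */
static inline bool
demo_uuid_is_nil_u64(const struct demo_uuid *uu)
{
	uint64_t w[2];
	memcpy(w, uu, sizeof(w));
	return (w[0] | w[1]) == 0;
}
```

Whether a given compiler turns these fixed-size memcmp/memcpy calls into plain loads is something to check in the generated code for the target build; the sketch makes no claim about relative speed.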