From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id ACC0445C304 for ; Wed, 2 Dec 2020 01:04:18 +0300 (MSK) References: <20201114172823.8217-1-sergepetrenko@tarantool.org> <20201114172823.8217-3-sergepetrenko@tarantool.org> <5689a42c-1207-2259-9dca-c290ccce0849@tarantool.org> <7c6dd0b0-bf3a-7fbe-3116-3df3d23882b7@tarantool.org> From: Vladislav Shpilevoy Message-ID: <952223de-fd5d-9048-d23b-31eb5b2820a7@tarantool.org> Date: Tue, 1 Dec 2020 23:04:16 +0100 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Serge Petrenko , korablev@tarantool.org Cc: tarantool-patches@dev.tarantool.org >> Could be done with templates relatively trivial. We just make path >> check a boolean template argument, and check it before checking the >> path. >> >> Or we can literally split this function into 2 independent functions, >> because it seems they will look quite different. >>>>> Since both tuple_field_raw_by_part and tuple_field_raw are basically >>>>> aliases to tuple_field_raw_by_path, prefer the former to the latter, >>>>> since it takes advantage over key part's offset slot cache. >>>> Besides, I can't find any info why do we need this 'offset slot cache'. >>>> The comment above this member simply narrates the code, and I don't >>>> understand the actual reason it was added. Did you manage to realize it? >>>> Can you explain, please? >>> Yep. This cache ameliorates the performance loss for looking the field up >>> in JSON tree every time we need to access some tuple field by offset. >>> >>> Say, each time you call tuple_compare_slowpath and it gets to comparing >>> field values, you'll look up the field in JSON tree to know its offset slot. >>> >>> You may have lots of calls to tuple_compare for different tuples but for >>> the same set of key_parts. So instead of looking the field up on every >>> comparison, you look it up once and remember its offset slot in the >>> key part. Now next time you need offset slot access you don't need >>> to look for the field in the JSON tree at all. >>> >>> Looks like initially this cache was intended to save us from some complex >>> tree lookups, when path is actually not null. >>> >>> But the cache helps even for top-level field access. It helps skip not only >>> the JSON tree lookup, but all the additional checks coming after it. >>> >>> So that's why I made use of it in this patch. >> There must be a problem with these offset slots in case of space alter. If >> we add a new index or remove another index, the existing tuples will have >> 2 formats in one index. Some tuples will have the old format, and some will >> have the new one. With probably different offset slots for the same index >> parts. Or not, but it does not matter - format->epoch will change anyway. >> >> AFAIU, such a simple alter can make the offset slots almost not used, because >> tuple_field_raw_by_part will reset them on each switch from tuple of one >> format to a tuple of another format. >> >> Maybe we don't want to optimize it though, since alter is not a common case. > > > I didn't think of that. Aren't tuples updating their format on the fly? > Or you probably have to replace the tuple for its format to be updated? No, tuple never changes its format. The only way is to replace the old tuples.