Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Serge Petrenko <sergepetrenko@tarantool.org>, korablev@tarantool.org
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible
Date: Tue, 1 Dec 2020 23:04:16 +0100	[thread overview]
Message-ID: <952223de-fd5d-9048-d23b-31eb5b2820a7@tarantool.org> (raw)
In-Reply-To: <c3c532fb-620e-493c-ccd6-833bbfe8830d@tarantool.org>

>> Could be done with templates relatively trivial. We just make path
>> check a boolean template argument, and check it before checking the
>> path.
>>
>> Or we can literally split this function into 2 independent functions,
>> because it seems they will look quite different.
>>>>> Since both tuple_field_raw_by_part and tuple_field_raw are basically
>>>>> aliases to tuple_field_raw_by_path, prefer the former to the latter,
>>>>> since it takes advantage over key part's offset slot cache.
>>>> Besides, I can't find any info why do we need this 'offset slot cache'.
>>>> The comment above this member simply narrates the code, and I don't
>>>> understand the actual reason it was added. Did you manage to realize it?
>>>> Can you explain, please?
>>> Yep. This cache ameliorates the performance loss for looking the field up
>>> in JSON tree every time we need to access some tuple field by offset.
>>>
>>> Say, each time you call tuple_compare_slowpath and it gets to comparing
>>> field values, you'll look up the field in JSON tree to know its offset slot.
>>>
>>> You may have lots of calls to tuple_compare for different tuples but for
>>> the same set of key_parts. So instead of looking the field up on every
>>> comparison, you look it up once and remember its offset slot in the
>>> key part. Now next time you need offset slot access you don't need
>>> to look for the field in the JSON tree at all.
>>>
>>> Looks like initially this cache was intended to save us from some complex
>>> tree lookups, when path is actually not null.
>>>
>>> But the cache helps even for top-level field access. It helps skip not only
>>> the JSON tree lookup, but all the additional checks coming after it.
>>>
>>> So that's why I made use of it in this patch.
>> There must be a problem with these offset slots in case of space alter. If
>> we add a new index or remove another index, the existing tuples will have
>> 2 formats in one index. Some tuples will have the old format, and some will
>> have the new one. With probably different offset slots for the same index
>> parts. Or not, but it does not matter - format->epoch will change anyway.
>>
>> AFAIU, such a simple alter can make the offset slots almost not used, because
>> tuple_field_raw_by_part will reset them on each switch from tuple of one
>> format to a tuple of another format.
>>
>> Maybe we don't want to optimize it though, since alter is not a common case.
> 
> 
> I didn't think of that. Aren't tuples updating their format on the fly?
> Or you probably have to replace the tuple for its format to be updated?

No, tuple never changes its format. The only way is to replace the
old tuples.

  reply	other threads:[~2020-12-01 22:04 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-14 17:28 [Tarantool-patches] [PATCH 0/2] reduce performance degradation introduced by JSON path indices Serge Petrenko
2020-11-14 17:28 ` [Tarantool-patches] [PATCH 1/2] box: speed up tuple_field_map_create Serge Petrenko
2020-11-14 18:00   ` Cyrill Gorcunov
2020-11-16  7:34     ` Serge Petrenko
2020-11-20 23:26   ` Vladislav Shpilevoy
2020-11-23 13:50     ` Serge Petrenko
2020-11-24 22:33       ` Vladislav Shpilevoy
2020-12-01  8:48         ` Serge Petrenko
2020-12-01 22:04           ` Vladislav Shpilevoy
2020-12-02  9:53             ` Serge Petrenko
2020-11-14 17:28 ` [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible Serge Petrenko
2020-11-20 23:26   ` Vladislav Shpilevoy
2020-11-23 13:51     ` Serge Petrenko
2020-11-24 22:33       ` Vladislav Shpilevoy
2020-12-01  8:48         ` Serge Petrenko
2020-12-01 22:04           ` Vladislav Shpilevoy [this message]
2020-11-16  7:50 ` [Tarantool-patches] [PATCH 0/2] reduce performance degradation introduced by JSON path indices Serge Petrenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=952223de-fd5d-9048-d23b-31eb5b2820a7@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=korablev@tarantool.org \
    --cc=sergepetrenko@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox