From: Serge Petrenko <sergepetrenko@tarantool.org> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, korablev@tarantool.org Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible Date: Mon, 23 Nov 2020 16:51:01 +0300 [thread overview] Message-ID: <7c6dd0b0-bf3a-7fbe-3116-3df3d23882b7@tarantool.org> (raw) In-Reply-To: <5689a42c-1207-2259-9dca-c290ccce0849@tarantool.org> 21.11.2020 02:26, Vladislav Shpilevoy пишет: > Thanks for the patch! Thanks for your answer! > On 14.11.2020 18:28, Serge Petrenko wrote: >> tuple_field_raw_by_part allows to use part's offset slot cache to bypass >> tuple field lookup in a JSON tree, which is involved even for plain >> tuples. > I am not sure I understand. How does it work in 1.10 then? It uses > tuple_field_by_part_raw, which in the end simply calls > > tuple_field_raw(format, data, field_map, part->fieldno); > > Exactly what we had here. Why did anything change? tuple_field_raw() itself changed. On 1.10 it accesses field as an array element, then takes its offset slot and finds the shift in the field map. On 2.x it aliases to tuple_field_raw_by_path(): finds the top-level field in the format's JSON tree, does a couple of checks for JSON paths and multikey indices and only then takes the offset like 1.10 did. But while tuple_field_raw_by_path() takes an offset slot hint, tuple_field_raw() doesn't use it, so its call always results in searching the JSON tree. >> Since both tuple_field_raw_by_part and tuple_field_raw are basically >> aliases to tuple_field_raw_by_path, prefer the former to the latter, >> since it takes advantage over key part's offset slot cache. > Besides, I can't find any info why do we need this 'offset slot cache'. > The comment above this member simply narrates the code, and I don't > understand the actual reason it was added. Did you manage to realize it? > Can you explain, please? Yep. This cache ameliorates the performance loss for looking the field up in JSON tree every time we need to access some tuple field by offset. Say, each time you call tuple_compare_slowpath and it gets to comparing field values, you'll look up the field in JSON tree to know its offset slot. You may have lots of calls to tuple_compare for different tuples but for the same set of key_parts. So instead of looking the field up on every comparison, you look it up once and remember its offset slot in the key part. Now next time you need offset slot access you don't need to look for the field in the JSON tree at all. Looks like initially this cache was intended to save us from some complex tree lookups, when path is actually not null. But the cache helps even for top-level field access. It helps skip not only the JSON tree lookup, but all the additional checks coming after it. So that's why I made use of it in this patch. -- Serge Petrenko
next prev parent reply other threads:[~2020-11-23 13:51 UTC|newest] Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-11-14 17:28 [Tarantool-patches] [PATCH 0/2] reduce performance degradation introduced by JSON path indices Serge Petrenko 2020-11-14 17:28 ` [Tarantool-patches] [PATCH 1/2] box: speed up tuple_field_map_create Serge Petrenko 2020-11-14 18:00 ` Cyrill Gorcunov 2020-11-16 7:34 ` Serge Petrenko 2020-11-20 23:26 ` Vladislav Shpilevoy 2020-11-23 13:50 ` Serge Petrenko 2020-11-24 22:33 ` Vladislav Shpilevoy 2020-12-01 8:48 ` Serge Petrenko 2020-12-01 22:04 ` Vladislav Shpilevoy 2020-12-02 9:53 ` Serge Petrenko 2020-11-14 17:28 ` [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible Serge Petrenko 2020-11-20 23:26 ` Vladislav Shpilevoy 2020-11-23 13:51 ` Serge Petrenko [this message] 2020-11-24 22:33 ` Vladislav Shpilevoy 2020-12-01 8:48 ` Serge Petrenko 2020-12-01 22:04 ` Vladislav Shpilevoy 2020-11-16 7:50 ` [Tarantool-patches] [PATCH 0/2] reduce performance degradation introduced by JSON path indices Serge Petrenko
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=7c6dd0b0-bf3a-7fbe-3116-3df3d23882b7@tarantool.org \ --to=sergepetrenko@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 2/2] box: use tuple_field_raw_by_part where possible' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox