From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [PATCH v9 5/6] box: introduce offset_slot cache in key_part
Date: Mon, 4 Feb 2019 18:10:59 +0300
Message-ID: <20190204151059.vf7ptiyi2nczsp6w@esperanza>
In-Reply-To: <709ad0bac8976ed78bcf0ce418567b2e3a378a77.1549187339.git.kshcherbatov@tarantool.org>

On Sun, Feb 03, 2019 at 01:20:25PM +0300, Kirill Shcherbatov wrote:
> tuple_field_by_part looks up the tuple_field corresponding to the
> given key part in tuple_format in order to quickly retrieve the offset
> of indexed data from the tuple field map. For regular indexes this
> operation is blazing fast, however for JSON indexes it is not, as we
> have to parse the path to data and then do multiple lookups in a JSON
> tree. Since tuple_field_by_part is used by comparators, we should
> strive to make this routine as fast as possible for all kinds of
> indexes.
>
> This patch introduces an optimization that is supposed to make
> tuple_field_by_part for JSON indexes as fast as it is for regular
> indexes in most cases. We do that by caching the offset slot right in
> key_part. There's a catch here however - we create a new format
> whenever an index is dropped or created and we don't reindex old
> tuples. As a result, there may be several generations of tuples in the
> same space, all using different formats while there's only one key_def
> used for comparison.
>
> To overcome this problem, we introduce the notion of tuple_format
> epoch. This is a counter incremented each time a new format is
> created. We store it in tuple_format and key_def, and we only use
> the offset slot cached in a key_def if its epoch coincides with the
> epoch of the tuple format. If they don't, we look up a tuple_field as
> before, and then update the cached value provided the epoch of the
> tuple format.
>
> Part of #1012
> ---
>  src/box/key_def.c      | 15 ++++++++++-----
>  src/box/key_def.h      | 14 ++++++++++++++
>  src/box/tuple.h        |  2 +-
>  src/box/tuple_format.c |  6 +++++-
>  src/box/tuple_format.h | 41 +++++++++++++++++++++++++++++++++++------
>  5 files changed, 65 insertions(+), 13 deletions(-)

Pushed to 2.1 with the following minor changes:

diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index fc152cbb..2d9b71ee 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -674,7 +674,6 @@ tuple_format_reuse(struct tuple_format **p_format)
 		tuple_format_destroy(format);
 		free(format);
 		*p_format = *entry;
-		(*p_format)->epoch = ++formats_epoch;
 		return true;
 	}
 	return false;
@@ -738,12 +737,12 @@ tuple_format_new(struct tuple_format_vtab *vtab, void *engine,
 	format->is_temporary = is_temporary;
 	format->is_ephemeral = is_ephemeral;
 	format->exact_field_count = exact_field_count;
+	format->epoch = ++formats_epoch;
 	if (tuple_format_create(format, keys, key_count, space_fields,
 				space_field_count) < 0)
 		goto err;
 	if (tuple_format_reuse(&format))
 		return format;
-	format->epoch = ++formats_epoch;
 	if (tuple_format_register(format) < 0)
 		goto err;
 	if (tuple_format_add_to_hash(format) < 0) {
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index aedd3e91..01ed97ae 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -137,12 +137,6 @@ tuple_field_is_nullable(struct tuple_field *tuple_field)
  * Tuple format describes how tuple is stored and information about its fields
  */
 struct tuple_format {
-	/**
-	 * Counter that grows incrementally on space rebuild
-	 * used for caching offset slot in key_part, for more
-	 * details see key_part::offset_slot_cache.
-	 */
-	uint64_t epoch;
 	/** Virtual function table */
 	struct tuple_format_vtab vtab;
 	/** Pointer to engine-specific data. */
@@ -155,6 +149,12 @@ struct tuple_format {
 	 * ephemeral spaces.
 	 */
 	uint32_t hash;
+	/**
+	 * Counter that grows incrementally on space rebuild
+	 * used for caching offset slot in key_part, for more
+	 * details see key_part::offset_slot_cache.
+	 */
+	uint64_t epoch;
 	/** Reference counter */
 	int refs;
 	/**
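
For readers following the thread, the epoch check described in the quoted commit message can be sketched roughly as below. This is a simplified illustration, not the code from the patch: struct format, struct key_part, format_init(), slow_lookup() and key_part_offset_slot() are hypothetical stand-ins, and the rule that the cache is refreshed only when the tuple's format is newer than the cached epoch is an assumption, since the commit message does not spell out the condition. Only the names epoch, formats_epoch and offset_slot_cache come from the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for struct tuple_format. */
struct format {
	uint64_t epoch;      /* assigned once, when the format is created */
	int32_t offset_slot; /* what the slow JSON-tree lookup would find */
};

/* Hypothetical, stripped-down stand-in for struct key_part. */
struct key_part {
	uint64_t format_epoch;     /* epoch the cached slot belongs to */
	int32_t offset_slot_cache; /* valid only for that epoch */
};

/* Global counter; starts at 0, so epoch 0 means "nothing cached yet". */
static uint64_t formats_epoch = 0;

/* Counts slow lookups so the effect of the cache is observable. */
static int slow_lookups = 0;

static void
format_init(struct format *format, int32_t offset_slot)
{
	/* As in the diff: the epoch is taken at creation time. */
	format->epoch = ++formats_epoch;
	format->offset_slot = offset_slot;
}

/* Placeholder for parsing the JSON path and walking the field tree. */
static int32_t
slow_lookup(const struct format *format)
{
	slow_lookups++;
	return format->offset_slot;
}

static int32_t
key_part_offset_slot(struct key_part *part, const struct format *format)
{
	/* Fast path: the cached slot was computed for this very format. */
	if (part->format_epoch == format->epoch)
		return part->offset_slot_cache;
	int32_t slot = slow_lookup(format);
	/*
	 * Assumption: refresh the cache only for newer formats, so that
	 * tuples of an old generation do not evict the current entry.
	 */
	if (format->epoch > part->format_epoch) {
		part->format_epoch = format->epoch;
		part->offset_slot_cache = slot;
	}
	return slot;
}
```

With two formats created in sequence and a zero-initialized key_part, the first lookup against the newer format takes the slow path and fills the cache, a repeated lookup is served from the cache, and a lookup for a tuple of the older format falls back to the slow path without touching the cached entry.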