* [PATCH v6 0/3] box: introduce hint option for memtx tree index
@ 2019-03-13 12:15 Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

Reworked memtx tree to use 'tuple hints'. Introduced special functions
for retrieving tuple hints for a particular key_def.

A hint is an integer that can be used for tuple comparison optimization:
if the hint of one tuple is less than the hint of another, the first
tuple is definitely less than the second; only if the hints are equal
must tuple_compare be called to get the comparison result.

Hints are only useful when:
* they are precalculated and stored along with the tuple; the
  calculation is not cheap (though cheaper than tuple_compare), but
  precalculated hints allow comparing tuples without even fetching the
  tuple data;
* the first part of the key_def is 'string', 'unsigned' or 'integer';
* since a hint is calculated using only the first part of the key_def
  (and only the first several characters if it is a string), this part
  must be different with high probability for every tuple pair.

Enabling the hint option improves performance by 15% on average; select
operations are accelerated significantly (there are scenarios in which
the difference reaches 200-250%).

Also appended a multikey index prototype. I am going to try to rework
field_map initialization and resend the last letter a bit later.

Changes in version 6:
-- Changed hint format: now the first two bytes are reserved for type
-- Hints for SCALAR type
-- All compare routines now have two versions: hinted and non-hinted

Changes in version 5:
-- Code rewritten without classes and macro definitions, using vtabs.
-- Appended multikey index prototype.

Changes in version 4:
-- Code rewritten in C++ with classes. This provides better
   maintainability in the future.
-- New hints for number and boolean types. Memtx tree is always hinted
   now.
-- INVALID_HINT marker. We need it because doubles have special values
   such as NaN that mustn't be a problem of the hints business.
-- Rebased the patch after part of the code was merged.

Changes in version 3:
-- Better-structured code
-- Refactored all memtx indexes to use a shared mempool of default size
-- Refactored all memtx indexes to hide implementation details from
   headers
-- Moved all hints-related code to the corresponding module
-- Better type names, better comments
-- Fixed a potential bug with iterators: introduced the
   MEMTX_TREE_IDENTICAL macro
-- Fixed inaccurate MEMTX_TREE_ELEM_SET usage in
   memtx_tree_index_build_next
-- Changed the approach to calculating string hints
-- Introduced a separate hint for the binary collation type

Changes in version 2:
-- Split the patch into parts differently to decrease the diff
-- Hints are no longer an index option, but the default where possible
-- Removed hints for numeric types

v5: https://www.freelists.org/post/tarantool-patches/PATCH-v5-04-box-introduce-hint-option-for-memtx-tree-index
v4: https://www.freelists.org/post/tarantool-patches/PATCH-v4-07-box-introduce-hint-option-for-memtx-tree-index
v3: https://www.freelists.org/post/tarantool-patches/PATCH-v3-07-box-introduce-hint-option-for-memtx-tree-index
v2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-04-box-introduce-hint-option-for-memtx-tree-index
v1: https://www.freelists.org/post/tarantool-patches/PATCH-v1-04-box-introduce-hint-option-for-memtx-tree-index

http://github.com/tarantool/tarantool/tree/kshch/gh-3961-tuple-hints
https://github.com/tarantool/tarantool/issues/3961

Kirill Shcherbatov (3):
  box: refactor key_def_set_compare_func routine
  memtx: introduce tuple compare hint
  box: introduce multikey indexes

 src/box/key_def.c         |  15 +
 src/box/key_def.h         | 100 ++++++
 src/box/memtx_tree.c      | 217 +++++++++++-
 src/box/tuple.c           |   8 +-
 src/box/tuple.h           | 122 ++++++-
 src/box/tuple_compare.cc  | 705 +++++++++++++++++++++++++++++++++++---
 src/box/tuple_format.c    | 120 +++++--
 src/lib/coll/coll.c       |  33 ++
 src/lib/coll/coll.h       |   4 +
 test/engine/json.result   |  80 ++++-
 test/engine/json.test.lua |  20 ++
 11 files changed, 1330 insertions(+), 94 deletions(-)

-- 
2.21.0

^ permalink raw reply	[flat|nested] 11+ messages in thread
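The hint fast path described in the cover letter can be reduced to a small sketch. This is a simplified stand-in for illustration, not the patch's actual code: `tuple` is left opaque and the fallback comparator is passed explicitly, whereas the real series stores it in `key_def`.

```cpp
#include <cstdint>

/* Opaque stand-in; the real code works on struct tuple and
 * key_def->tuple_compare. */
struct tuple;
typedef int (*tuple_compare_t)(const tuple *a, const tuple *b);

/*
 * Compare two tuples by their precalculated hints first; only when the
 * hints tie do we pay for the full field-by-field comparison, which
 * requires fetching tuple data.
 */
static int
tuple_compare_hinted(const tuple *a, uint64_t a_hint,
		     const tuple *b, uint64_t b_hint,
		     tuple_compare_t base_compare)
{
	if (a_hint != b_hint)
		return a_hint < b_hint ? -1 : 1;
	return base_compare(a, b);
}
```

Note that a hint can only order tuples coarsely: equal hints say nothing, which is why the slow comparator must still exist, and why hints pay off only when first key parts differ with high probability.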
* [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
  2019-03-14  7:04   ` Vladimir Davydov
  2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov
  2 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

Refactored the tuple_compare_create and tuple_compare_with_key_create
routines to set the corresponding key_def field on their own instead of
returning a pointer to a function. This is required because in further
patches we need to set two key_def pointers: one for the hinted and one
for the non-hinted routine version.

Needed for #3961
---
 src/box/tuple_compare.cc | 62 ++++++++++++++++++++++++----------------
 1 file changed, 38 insertions(+), 24 deletions(-)

diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index cf4519ccb..1bcff2ca3 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -1049,21 +1049,26 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
 	tuple_compare_slowpath<true, true, true>
 };
 
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
+void
+key_def_set_tuple_compare(struct key_def *def)
 {
 	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
 			   2 * (def->has_optional_parts ? 1 : 0) +
 			   4 * (def->has_json_paths ? 1 : 0);
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
-			if (def->has_optional_parts)
-				return tuple_compare_sequential<true, true>;
-			else
-				return tuple_compare_sequential<true, false>;
+			if (def->has_optional_parts) {
+				def->tuple_compare =
+					tuple_compare_sequential<true, true>;
+			} else {
+				def->tuple_compare =
+					tuple_compare_sequential<true, false>;
+			}
 		} else {
-			return compare_slowpath_funcs[cmp_func_idx];
+			def->tuple_compare =
+				compare_slowpath_funcs[cmp_func_idx];
 		}
+		return;
 	}
 	assert(! def->has_optional_parts);
 	if (!key_def_has_collation(def) && !def->has_json_paths) {
@@ -1078,13 +1083,15 @@ tuple_compare_create(const struct key_def *def)
 			    cmp_arr[k].p[i * 2 + 1])
 				break;
 			if (i == def->part_count &&
-			    cmp_arr[k].p[i * 2] == UINT32_MAX)
-				return cmp_arr[k].f;
+			    cmp_arr[k].p[i * 2] == UINT32_MAX) {
+				def->tuple_compare = cmp_arr[k].f;
+				return;
+			}
 		}
 	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_sequential<false, false> :
-	       compare_slowpath_funcs[cmp_func_idx];
+	def->tuple_compare = key_def_is_sequential(def) ?
+			     tuple_compare_sequential<false, false> :
+			     compare_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare */
@@ -1277,8 +1284,8 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
 	tuple_compare_with_key_slowpath<true, true, true>
 };
 
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
+void
+key_def_set_tuple_compare_with_key(struct key_def *def)
 {
 	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
 			   2 * (def->has_optional_parts ? 1 : 0) +
@@ -1286,15 +1293,19 @@ tuple_compare_with_key_create(const struct key_def *def)
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts) {
-				return tuple_compare_with_key_sequential<true,
+				def->tuple_compare_with_key =
+					tuple_compare_with_key_sequential<true,
 					true>;
 			} else {
-				return tuple_compare_with_key_sequential<true,
+				def->tuple_compare_with_key =
+					tuple_compare_with_key_sequential<true,
 					false>;
 			}
 		} else {
-			return compare_with_key_slowpath_funcs[cmp_func_idx];
+			def->tuple_compare_with_key =
+				compare_with_key_slowpath_funcs[cmp_func_idx];
 		}
+		return;
 	}
 	assert(! def->has_optional_parts);
 	if (!key_def_has_collation(def) && !def->has_json_paths) {
@@ -1312,13 +1323,16 @@ tuple_compare_with_key_create(const struct key_def *def)
 				break;
 			}
 		}
-		if (i == def->part_count)
-			return cmp_wk_arr[k].f;
+		if (i == def->part_count) {
+			def->tuple_compare_with_key = cmp_wk_arr[k].f;
+			return;
+		}
 		}
 	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_with_key_sequential<false, false> :
-	       compare_with_key_slowpath_funcs[cmp_func_idx];
+	def->tuple_compare_with_key =
+		key_def_is_sequential(def) ?
+		tuple_compare_with_key_sequential<false, false> :
+		compare_with_key_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare_with_key */
@@ -1326,6 +1340,6 @@ tuple_compare_with_key_create(const struct key_def *def)
 void
 key_def_set_compare_func(struct key_def *def)
 {
-	def->tuple_compare = tuple_compare_create(def);
-	def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+	key_def_set_tuple_compare(def);
+	key_def_set_tuple_compare_with_key(def);
 }
-- 
2.21.0

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
@ 2019-03-14  7:04   ` Vladimir Davydov
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-14  7:04 UTC (permalink / raw)
To: Kirill Shcherbatov; +Cc: tarantool-patches

On Wed, Mar 13, 2019 at 03:15:37PM +0300, Kirill Shcherbatov wrote:
> @@ -1326,6 +1340,6 @@ tuple_compare_with_key_create(const struct key_def *def)
>  void
>  key_def_set_compare_func(struct key_def *def)
>  {
> -	def->tuple_compare = tuple_compare_create(def);
> -	def->tuple_compare_with_key = tuple_compare_with_key_create(def);
> +	key_def_set_tuple_compare(def);
> +	key_def_set_tuple_compare_with_key(def);
>  }

The two functions share a lot of code - all branching is basically the
same. I think we better merge them instead.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine 2019-03-14 7:04 ` Vladimir Davydov @ 2019-03-15 10:20 ` Kirill Shcherbatov 2019-03-15 10:55 ` Kirill Shcherbatov 2019-03-19 19:38 ` Vladimir Davydov 0 siblings, 2 replies; 11+ messages in thread From: Kirill Shcherbatov @ 2019-03-15 10:20 UTC (permalink / raw) To: tarantool-patches, Vladimir Davydov Refactored key_def_set_compare_func routine to make it easier to read, maintain, and extend. This is necessary due to the fact that in a series of subsequent patches it will be significantly expanded. Needed for #3961 --- src/box/tuple_compare.cc | 195 +++++++++++++++++++-------------------- 1 file changed, 93 insertions(+), 102 deletions(-) diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index dfe30b190..7fe1766a8 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -986,7 +986,7 @@ struct comparator_signature { /** * field1 no, field1 type, field2 no, field2 type, ... */ -static const comparator_signature cmp_arr[] = { +static const comparator_signature precompiled_cmp_arr[] = { COMPARATOR(0, FIELD_TYPE_UNSIGNED) COMPARATOR(0, FIELD_TYPE_STRING) COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED) @@ -1005,55 +1005,6 @@ static const comparator_signature cmp_arr[] = { #undef COMPARATOR -static const tuple_compare_t compare_slowpath_funcs[] = { - tuple_compare_slowpath<false, false, false>, - tuple_compare_slowpath<true, false, false>, - tuple_compare_slowpath<false, true, false>, - tuple_compare_slowpath<true, true, false>, - tuple_compare_slowpath<false, false, true>, - tuple_compare_slowpath<true, false, true>, - tuple_compare_slowpath<false, true, true>, - tuple_compare_slowpath<true, true, true> -}; - -static tuple_compare_t -tuple_compare_create(const struct key_def *def) -{ - int cmp_func_idx = (def->is_nullable ? 1 : 0) + - 2 * (def->has_optional_parts ? 1 : 0) + - 4 * (def->has_json_paths ? 
1 : 0); - if (def->is_nullable) { - if (key_def_is_sequential(def)) { - if (def->has_optional_parts) - return tuple_compare_sequential<true, true>; - else - return tuple_compare_sequential<true, false>; - } else { - return compare_slowpath_funcs[cmp_func_idx]; - } - } - assert(! def->has_optional_parts); - if (!key_def_has_collation(def) && !def->has_json_paths) { - /* Precalculated comparators don't use collation */ - for (uint32_t k = 0; - k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) { - uint32_t i = 0; - for (; i < def->part_count; i++) - if (def->parts[i].fieldno != - cmp_arr[k].p[i * 2] || - def->parts[i].type != - cmp_arr[k].p[i * 2 + 1]) - break; - if (i == def->part_count && - cmp_arr[k].p[i * 2] == UINT32_MAX) - return cmp_arr[k].f; - } - } - return key_def_is_sequential(def) ? - tuple_compare_sequential<false, false> : - compare_slowpath_funcs[cmp_func_idx]; -} - /* }}} tuple_compare */ /* {{{ tuple_compare_with_key */ @@ -1215,7 +1166,7 @@ struct comparator_with_key_signature #define KEY_COMPARATOR(...) 
\ { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } }, -static const comparator_with_key_signature cmp_wk_arr[] = { +static const comparator_with_key_signature precompiled_cmp_wk_arr[] = { KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED) KEY_COMPARATOR(0, FIELD_TYPE_STRING , 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED) KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_STRING , 2, FIELD_TYPE_UNSIGNED) @@ -1231,68 +1182,108 @@ static const comparator_with_key_signature cmp_wk_arr[] = { KEY_COMPARATOR(1, FIELD_TYPE_STRING , 2, FIELD_TYPE_STRING) }; -#undef KEY_COMPARATOR - -static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = { - tuple_compare_with_key_slowpath<false, false, false>, - tuple_compare_with_key_slowpath<true, false, false>, - tuple_compare_with_key_slowpath<false, true, false>, - tuple_compare_with_key_slowpath<true, true, false>, - tuple_compare_with_key_slowpath<false, false, true>, - tuple_compare_with_key_slowpath<true, false, true>, - tuple_compare_with_key_slowpath<false, true, true>, - tuple_compare_with_key_slowpath<true, true, true> +#undef KEY_COMPARATORq + +#define COMPARE_SLOWPATH(...) \ + {tuple_compare_slowpath<__VA_ARGS__>, \ + tuple_compare_with_key_slowpath<__VA_ARGS__>, \ + __VA_ARGS__, false} + +#define COMPARE_SEQUENTIAL(...) 
\ + {tuple_compare_sequential<__VA_ARGS__>, \ + tuple_compare_with_key_sequential<__VA_ARGS__>, \ + __VA_ARGS__, false, true} + +static struct { + tuple_compare_t tuple_compare; + tuple_compare_with_key_t tuple_compare_with_key; + bool is_nullable; + bool has_optional_parts; + bool has_json_paths; + bool is_sequential; +} cmp_arr[] = { + COMPARE_SLOWPATH(false, false, false), + COMPARE_SLOWPATH(true, false, false), + COMPARE_SLOWPATH(false, true, false), + COMPARE_SLOWPATH(true, true, false), + COMPARE_SLOWPATH(false, false, true), + COMPARE_SLOWPATH(true, false, true), + COMPARE_SLOWPATH(false, true, true), + COMPARE_SLOWPATH(true, true, true), + COMPARE_SEQUENTIAL(false, false), + COMPARE_SEQUENTIAL(true, false), + COMPARE_SEQUENTIAL(false, true), + COMPARE_SEQUENTIAL(true, true), }; -static tuple_compare_with_key_t -tuple_compare_with_key_create(const struct key_def *def) +#undef COMPARE_SLOWPATH +#undef COMPARE_SEQUENTIAL + +/* }}} tuple_compare_with_key */ + +void +key_def_set_compare_func(struct key_def *def) { - int cmp_func_idx = (def->is_nullable ? 1 : 0) + - 2 * (def->has_optional_parts ? 1 : 0) + - 4 * (def->has_json_paths ? 
1 : 0); - if (def->is_nullable) { - if (key_def_is_sequential(def)) { - if (def->has_optional_parts) { - return tuple_compare_with_key_sequential<true, - true>; - } else { - return tuple_compare_with_key_sequential<true, - false>; + def->tuple_compare = NULL; + def->tuple_compare_with_key = NULL; + if (!def->is_nullable && !key_def_has_collation(def) && + !def->has_json_paths) { + /* Precalculated comparators don't use collation */ + for (uint32_t k = 0; k < sizeof(precompiled_cmp_arr) / + sizeof(precompiled_cmp_arr[0]); k++) { + uint32_t i = 0; + for (; i < def->part_count; i++) { + if (def->parts[i].fieldno != + precompiled_cmp_arr[k].p[i * 2] || + def->parts[i].type != + precompiled_cmp_arr[k].p[i * 2 + 1]) + break; + } + if (i == def->part_count && + precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) { + def->tuple_compare = precompiled_cmp_arr[k].f; + break; } - } else { - return compare_with_key_slowpath_funcs[cmp_func_idx]; } - } - assert(! def->has_optional_parts); - if (!key_def_has_collation(def) && !def->has_json_paths) { - /* Precalculated comparators don't use collation */ - for (uint32_t k = 0; - k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]); - k++) { - + for (uint32_t k = 0; k < sizeof(precompiled_cmp_wk_arr) / + sizeof(precompiled_cmp_wk_arr[0]); k++) { uint32_t i = 0; for (; i < def->part_count; i++) { if (def->parts[i].fieldno != - cmp_wk_arr[k].p[i * 2] || - def->parts[i].type != - cmp_wk_arr[k].p[i * 2 + 1]) { + precompiled_cmp_wk_arr[k].p[i * 2] || + def->parts[i].type != + precompiled_cmp_wk_arr[k].p[i * 2 + 1]) break; + } + if (i == def->part_count) { + def->tuple_compare_with_key = + precompiled_cmp_wk_arr[k].f; + break; + } + } + } + if (def->tuple_compare == NULL || def->tuple_compare_with_key == NULL) { + for (uint32_t k = 0; k < sizeof(cmp_arr) / + sizeof(cmp_arr[0]); k++) { + if (def->is_nullable == cmp_arr[k].is_nullable && + def->has_optional_parts == + cmp_arr[k].has_optional_parts && + def->has_json_paths == cmp_arr[k].has_json_paths 
&& + key_def_is_sequential(def) == + cmp_arr[k].is_sequential) { + if (def->tuple_compare == NULL) { + def->tuple_compare = + cmp_arr[k].tuple_compare; } + if (def->tuple_compare_with_key == NULL) { + def->tuple_compare_with_key = + cmp_arr[k]. + tuple_compare_with_key; + } + break; } - if (i == def->part_count) - return cmp_wk_arr[k].f; } } - return key_def_is_sequential(def) ? - tuple_compare_with_key_sequential<false, false> : - compare_with_key_slowpath_funcs[cmp_func_idx]; -} - -/* }}} tuple_compare_with_key */ - -void -key_def_set_compare_func(struct key_def *def) -{ - def->tuple_compare = tuple_compare_create(def); - def->tuple_compare_with_key = tuple_compare_with_key_create(def); + assert(def->tuple_compare != NULL && + def->tuple_compare_with_key != NULL); } -- 2.21.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
@ 2019-03-15 10:55       ` Kirill Shcherbatov
  2019-03-19 19:38       ` Vladimir Davydov
  1 sibling, 0 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:55 UTC (permalink / raw)
To: tarantool-patches, Vladimir Davydov

@@ -1182,7 +1182,7 @@ static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
 	KEY_COMPARATOR(1, FIELD_TYPE_STRING , 2, FIELD_TYPE_STRING)
 };
 
-#undef KEY_COMPARATORq
+#undef KEY_COMPARATOR

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine 2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov 2019-03-15 10:55 ` Kirill Shcherbatov @ 2019-03-19 19:38 ` Vladimir Davydov 1 sibling, 0 replies; 11+ messages in thread From: Vladimir Davydov @ 2019-03-19 19:38 UTC (permalink / raw) To: Kirill Shcherbatov; +Cc: tarantool-patches On Fri, Mar 15, 2019 at 01:20:11PM +0300, Kirill Shcherbatov wrote: > +#define COMPARE_SLOWPATH(...) \ > + {tuple_compare_slowpath<__VA_ARGS__>, \ > + tuple_compare_with_key_slowpath<__VA_ARGS__>, \ > + __VA_ARGS__, false} > + > +#define COMPARE_SEQUENTIAL(...) \ > + {tuple_compare_sequential<__VA_ARGS__>, \ > + tuple_compare_with_key_sequential<__VA_ARGS__>, \ > + __VA_ARGS__, false, true} > + > +static struct { > + tuple_compare_t tuple_compare; > + tuple_compare_with_key_t tuple_compare_with_key; > + bool is_nullable; > + bool has_optional_parts; > + bool has_json_paths; > + bool is_sequential; > +} cmp_arr[] = { > + COMPARE_SLOWPATH(false, false, false), > + COMPARE_SLOWPATH(true, false, false), > + COMPARE_SLOWPATH(false, true, false), > + COMPARE_SLOWPATH(true, true, false), > + COMPARE_SLOWPATH(false, false, true), > + COMPARE_SLOWPATH(true, false, true), > + COMPARE_SLOWPATH(false, true, true), > + COMPARE_SLOWPATH(true, true, true), > + COMPARE_SEQUENTIAL(false, false), > + COMPARE_SEQUENTIAL(true, false), > + COMPARE_SEQUENTIAL(false, true), > + COMPARE_SEQUENTIAL(true, true), > }; > > -static tuple_compare_with_key_t > -tuple_compare_with_key_create(const struct key_def *def) > +#undef COMPARE_SLOWPATH > +#undef COMPARE_SEQUENTIAL > + > +/* }}} tuple_compare_with_key */ > + > +void > +key_def_set_compare_func(struct key_def *def) > { > - int cmp_func_idx = (def->is_nullable ? 1 : 0) + > - 2 * (def->has_optional_parts ? 1 : 0) + > - 4 * (def->has_json_paths ? 
1 : 0); > - if (def->is_nullable) { > - if (key_def_is_sequential(def)) { > - if (def->has_optional_parts) { > - return tuple_compare_with_key_sequential<true, > - true>; > - } else { > - return tuple_compare_with_key_sequential<true, > - false>; > + def->tuple_compare = NULL; > + def->tuple_compare_with_key = NULL; > + if (!def->is_nullable && !key_def_has_collation(def) && > + !def->has_json_paths) { > + /* Precalculated comparators don't use collation */ > + for (uint32_t k = 0; k < sizeof(precompiled_cmp_arr) / > + sizeof(precompiled_cmp_arr[0]); k++) { > + uint32_t i = 0; > + for (; i < def->part_count; i++) { > + if (def->parts[i].fieldno != > + precompiled_cmp_arr[k].p[i * 2] || > + def->parts[i].type != > + precompiled_cmp_arr[k].p[i * 2 + 1]) > + break; > + } > + if (i == def->part_count && > + precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) { > + def->tuple_compare = precompiled_cmp_arr[k].f; > + break; > } > - } else { > - return compare_with_key_slowpath_funcs[cmp_func_idx]; > } > - } > - assert(! 
def->has_optional_parts); > - if (!key_def_has_collation(def) && !def->has_json_paths) { > - /* Precalculated comparators don't use collation */ > - for (uint32_t k = 0; > - k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]); > - k++) { > - > + for (uint32_t k = 0; k < sizeof(precompiled_cmp_wk_arr) / > + sizeof(precompiled_cmp_wk_arr[0]); k++) { > uint32_t i = 0; > for (; i < def->part_count; i++) { > if (def->parts[i].fieldno != > - cmp_wk_arr[k].p[i * 2] || > - def->parts[i].type != > - cmp_wk_arr[k].p[i * 2 + 1]) { > + precompiled_cmp_wk_arr[k].p[i * 2] || > + def->parts[i].type != > + precompiled_cmp_wk_arr[k].p[i * 2 + 1]) > break; > + } > + if (i == def->part_count) { > + def->tuple_compare_with_key = > + precompiled_cmp_wk_arr[k].f; > + break; > + } > + } > + } > + if (def->tuple_compare == NULL || def->tuple_compare_with_key == NULL) { > + for (uint32_t k = 0; k < sizeof(cmp_arr) / > + sizeof(cmp_arr[0]); k++) { > + if (def->is_nullable == cmp_arr[k].is_nullable && > + def->has_optional_parts == > + cmp_arr[k].has_optional_parts && > + def->has_json_paths == cmp_arr[k].has_json_paths && > + key_def_is_sequential(def) == > + cmp_arr[k].is_sequential) { > + if (def->tuple_compare == NULL) { > + def->tuple_compare = > + cmp_arr[k].tuple_compare; > } > + if (def->tuple_compare_with_key == NULL) { > + def->tuple_compare_with_key = > + cmp_arr[k]. > + tuple_compare_with_key; > + } > + break; > } > - if (i == def->part_count) > - return cmp_wk_arr[k].f; > } > } All these unnamed booleans and variadic macros and linear search over them look mind-boggling IMO. I understand that this approach is kinda generic, but it just looks ugly. In fact, I don't see any point in such a generalization at this point at all. 
See, although we'll have four parameters - is_nullable,
has_optional_parts, has_json_paths, and is_multikey - they are not
independent:

 - has_optional_parts depends on is_nullable
 - is_multikey depends on has_json_paths

So the number of combinations is far from staggering, and I don't see
any new parameters coming in the near future.

So I dropped this patch and instead split key_def_set_compare_func into
sub-functions by key_def types. You will find the patch below. It
should be trivial to add hinted comparators and multikey support on top
of it without introducing deep if-else hierarchies or excessive code
repetition.

---
From 829c811cdeb98bd8e5cb91edf2197f0c340d55ed Mon Sep 17 00:00:00 2001
Message-Id: <829c811cdeb98bd8e5cb91edf2197f0c340d55ed.1553024132.git.vdavydov.dev@gmail.com>
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Date: Tue, 19 Mar 2019 20:27:42 +0300
Subject: [PATCH] tuple_compare: cleanup key_def virtual func setter

Over time the key_def virtual func setter code has turned into a
complete mess: now it's an incomprehensible mix of if-else blocks,
linear array search, and devious array indexing. What is worse, it's
duplicated by tuple_compare_create and tuple_compare_with_key_create.
Adding yet another parameter to key_def templates, which is needed to
implement multi-key indexes, is literally impossible.

This patch attempts to alleviate the situation by splitting code paths
for pre-compiled, plain, and json indexes, plus merging the
tuple_compare and tuple_compare_with_key setters.
--- src/box/tuple_compare.cc | 211 +++++++++++++++++++++++------------------------ 1 file changed, 105 insertions(+), 106 deletions(-) diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index dfe30b19..19b396e2 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -1005,55 +1005,6 @@ static const comparator_signature cmp_arr[] = { #undef COMPARATOR -static const tuple_compare_t compare_slowpath_funcs[] = { - tuple_compare_slowpath<false, false, false>, - tuple_compare_slowpath<true, false, false>, - tuple_compare_slowpath<false, true, false>, - tuple_compare_slowpath<true, true, false>, - tuple_compare_slowpath<false, false, true>, - tuple_compare_slowpath<true, false, true>, - tuple_compare_slowpath<false, true, true>, - tuple_compare_slowpath<true, true, true> -}; - -static tuple_compare_t -tuple_compare_create(const struct key_def *def) -{ - int cmp_func_idx = (def->is_nullable ? 1 : 0) + - 2 * (def->has_optional_parts ? 1 : 0) + - 4 * (def->has_json_paths ? 1 : 0); - if (def->is_nullable) { - if (key_def_is_sequential(def)) { - if (def->has_optional_parts) - return tuple_compare_sequential<true, true>; - else - return tuple_compare_sequential<true, false>; - } else { - return compare_slowpath_funcs[cmp_func_idx]; - } - } - assert(! def->has_optional_parts); - if (!key_def_has_collation(def) && !def->has_json_paths) { - /* Precalculated comparators don't use collation */ - for (uint32_t k = 0; - k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) { - uint32_t i = 0; - for (; i < def->part_count; i++) - if (def->parts[i].fieldno != - cmp_arr[k].p[i * 2] || - def->parts[i].type != - cmp_arr[k].p[i * 2 + 1]) - break; - if (i == def->part_count && - cmp_arr[k].p[i * 2] == UINT32_MAX) - return cmp_arr[k].f; - } - } - return key_def_is_sequential(def) ? 
- tuple_compare_sequential<false, false> : - compare_slowpath_funcs[cmp_func_idx]; -} - /* }}} tuple_compare */ /* {{{ tuple_compare_with_key */ @@ -1233,66 +1184,114 @@ static const comparator_with_key_signature cmp_wk_arr[] = { #undef KEY_COMPARATOR -static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = { - tuple_compare_with_key_slowpath<false, false, false>, - tuple_compare_with_key_slowpath<true, false, false>, - tuple_compare_with_key_slowpath<false, true, false>, - tuple_compare_with_key_slowpath<true, true, false>, - tuple_compare_with_key_slowpath<false, false, true>, - tuple_compare_with_key_slowpath<true, false, true>, - tuple_compare_with_key_slowpath<false, true, true>, - tuple_compare_with_key_slowpath<true, true, true> -}; - -static tuple_compare_with_key_t -tuple_compare_with_key_create(const struct key_def *def) -{ - int cmp_func_idx = (def->is_nullable ? 1 : 0) + - 2 * (def->has_optional_parts ? 1 : 0) + - 4 * (def->has_json_paths ? 1 : 0); - if (def->is_nullable) { - if (key_def_is_sequential(def)) { - if (def->has_optional_parts) { - return tuple_compare_with_key_sequential<true, - true>; - } else { - return tuple_compare_with_key_sequential<true, - false>; - } - } else { - return compare_with_key_slowpath_funcs[cmp_func_idx]; - } - } - assert(! def->has_optional_parts); - if (!key_def_has_collation(def) && !def->has_json_paths) { - /* Precalculated comparators don't use collation */ - for (uint32_t k = 0; - k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]); - k++) { - - uint32_t i = 0; - for (; i < def->part_count; i++) { - if (def->parts[i].fieldno != - cmp_wk_arr[k].p[i * 2] || - def->parts[i].type != - cmp_wk_arr[k].p[i * 2 + 1]) { - break; - } - } - if (i == def->part_count) - return cmp_wk_arr[k].f; - } - } - return key_def_is_sequential(def) ? 
- tuple_compare_with_key_sequential<false, false> : - compare_with_key_slowpath_funcs[cmp_func_idx]; -} - /* }}} tuple_compare_with_key */ +static void +key_def_set_compare_func_fast(struct key_def *def) +{ + assert(!def->is_nullable); + assert(!def->has_optional_parts); + assert(!def->has_json_paths); + assert(!key_def_has_collation(def)); + + tuple_compare_t cmp = NULL; + tuple_compare_with_key_t cmp_wk = NULL; + bool is_sequential = key_def_is_sequential(def); + + /* + * Use pre-compiled comparators if available, otherwise + * fall back on generic comparators. + */ + for (uint32_t k = 0; k < lengthof(cmp_arr); k++) { + uint32_t i = 0; + for (; i < def->part_count; i++) + if (def->parts[i].fieldno != cmp_arr[k].p[i * 2] || + def->parts[i].type != cmp_arr[k].p[i * 2 + 1]) + break; + if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) { + cmp = cmp_arr[k].f; + break; + } + } + for (uint32_t k = 0; k < lengthof(cmp_wk_arr); k++) { + uint32_t i = 0; + for (; i < def->part_count; i++) { + if (def->parts[i].fieldno != cmp_wk_arr[k].p[i * 2] || + def->parts[i].type != cmp_wk_arr[k].p[i * 2 + 1]) + break; + } + if (i == def->part_count) { + cmp_wk = cmp_wk_arr[k].f; + break; + } + } + if (cmp == NULL) { + cmp = is_sequential ? + tuple_compare_sequential<false, false> : + tuple_compare_slowpath<false, false, false>; + } + if (cmp_wk == NULL) { + cmp_wk = is_sequential ? 
+ tuple_compare_with_key_sequential<false, false> : + tuple_compare_with_key_slowpath<false, false, false>; + } + + def->tuple_compare = cmp; + def->tuple_compare_with_key = cmp_wk; +} + +template<bool is_nullable, bool has_optional_parts> +static void +key_def_set_compare_func_plain(struct key_def *def) +{ + assert(!def->has_json_paths); + if (key_def_is_sequential(def)) { + def->tuple_compare = tuple_compare_sequential + <is_nullable, has_optional_parts>; + def->tuple_compare_with_key = tuple_compare_with_key_sequential + <is_nullable, has_optional_parts>; + } else { + def->tuple_compare = tuple_compare_slowpath + <is_nullable, has_optional_parts, false>; + def->tuple_compare_with_key = tuple_compare_with_key_slowpath + <is_nullable, has_optional_parts, false>; + } +} + +template<bool is_nullable, bool has_optional_parts> +static void +key_def_set_compare_func_json(struct key_def *def) +{ + assert(def->has_json_paths); + def->tuple_compare = tuple_compare_slowpath + <is_nullable, has_optional_parts, true>; + def->tuple_compare_with_key = tuple_compare_with_key_slowpath + <is_nullable, has_optional_parts, true>; +} + void key_def_set_compare_func(struct key_def *def) { - def->tuple_compare = tuple_compare_create(def); - def->tuple_compare_with_key = tuple_compare_with_key_create(def); + if (!key_def_has_collation(def) && + !def->is_nullable && !def->has_json_paths) { + key_def_set_compare_func_fast(def); + } else if (!def->has_json_paths) { + if (def->is_nullable && def->has_optional_parts) { + key_def_set_compare_func_plain<true, true>(def); + } else if (def->is_nullable && !def->has_optional_parts) { + key_def_set_compare_func_plain<true, false>(def); + } else { + assert(!def->is_nullable && !def->has_optional_parts); + key_def_set_compare_func_plain<false, false>(def); + } + } else { + if (def->is_nullable && def->has_optional_parts) { + key_def_set_compare_func_json<true, true>(def); + } else if (def->is_nullable && !def->has_optional_parts) { + 
key_def_set_compare_func_json<true, false>(def); + } else { + assert(!def->is_nullable && !def->has_optional_parts); + key_def_set_compare_func_json<false, false>(def); + } + } } ^ permalink raw reply [flat|nested] 11+ messages in thread
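The refactoring in this patch routes every flag combination (nullable, optional parts, JSON paths) to a dedicated template instantiation, so the flags become compile-time constants inside each comparator. The dispatch pattern can be sketched in isolation as follows; the types here are simplified stand-ins, not the real Tarantool structures, and `compare_slowpath` is a stub:

```cpp
#include <cassert>
#include <cstring>

/* Simplified stand-ins for the real structures (hypothetical). */
struct key_def;
typedef int (*tuple_compare_t)(const char *a, const char *b,
			       struct key_def *def);

struct key_def {
	bool is_nullable;
	bool has_optional_parts;
	bool has_json_paths;
	tuple_compare_t tuple_compare;
};

/*
 * One template instantiation per flag combination, as in the patch:
 * the flags are compile-time constants inside the comparator, so
 * every branch on them can be folded away by the compiler.
 */
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static int
compare_slowpath(const char *a, const char *b, struct key_def *)
{
	/* A real comparator decodes key parts; strcmp is a stub. */
	return strcmp(a, b);
}

/* Route each flag combination to a dedicated instantiation. */
static void
key_def_set_compare_func(struct key_def *def)
{
	if (def->has_json_paths) {
		def->tuple_compare = def->is_nullable ?
			compare_slowpath<true, true, true> :
			compare_slowpath<false, false, true>;
	} else if (def->is_nullable) {
		def->tuple_compare = def->has_optional_parts ?
			compare_slowpath<true, true, false> :
			compare_slowpath<true, false, false>;
	} else {
		assert(!def->has_optional_parts);
		def->tuple_compare = compare_slowpath<false, false, false>;
	}
}
```

The point of the split is that each selected function pays only for the features its key_def actually uses; the branching cost is moved from every comparison to the one-time setup call.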
* [PATCH v6 2/3] memtx: introduce tuple compare hint 2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov 2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov @ 2019-03-13 12:15 ` Kirill Shcherbatov 2019-03-14 8:19 ` Vladimir Davydov 2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov 2 siblings, 1 reply; 11+ messages in thread From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw) To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov Implement functions for retrieving tuple hints for a particular key_def. A hint is an integer that can be used to optimize tuple comparison: if the hint of one tuple is less than the hint of another, then the first tuple is definitely less than the second; only if the hints are equal must tuple_compare be called to get the comparison result. Hints are calculated using only the first part of key_def. Hints are only useful when: * they are precalculated and stored along with the tuple; the calculation is not cheap (though cheaper than tuple_compare), but precalculated hints allow comparing tuples without even fetching the tuple data. * the first part of key_def is 'string' (for now) * since the hint is calculated using only the first part of key_def (and only the first several characters if it is a string), this part must differ with high probability for every tuple pair.
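The hint contract above (h(t1) < h(t2) implies t1 < t2; equal hints force a full comparison) can be illustrated with a self-contained sketch of the string case. The names `string_hint` and `hinted_cmp` are hypothetical; the patch itself packs only the first 7 bytes because the top two bits of the hint hold a type tag, while this toy version uses all 8 bytes:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

/*
 * Pack the first 8 bytes of a string into a big-endian uint64_t.
 * Lexicographic order of strings implies numeric order of these
 * hints, so hint(a) < hint(b) guarantees a < b; equal hints are
 * inconclusive (the strings may still differ past byte 8).
 */
static uint64_t
string_hint(const std::string &s)
{
	uint64_t h = 0;
	for (size_t i = 0; i < 8; i++) {
		h <<= 8;
		if (i < s.size())
			h |= (unsigned char)s[i];
	}
	return h;
}

/*
 * Hinted comparison: the hints decide without touching the string
 * data; only on a hint collision fall back to full comparison.
 */
static int
hinted_cmp(const std::string &a, uint64_t ha,
	   const std::string &b, uint64_t hb)
{
	if (ha != hb)
		return ha < hb ? -1 : 1;
	int rc = a.compare(b); /* slow path */
	return rc < 0 ? -1 : rc > 0 ? 1 : 0;
}
```

This is why the commit message stresses that the first key part should differ with high probability: hints only pay off when most comparisons are resolved without reaching the slow path.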
Closes #3961 --- src/box/key_def.h | 95 +++++++ src/box/memtx_tree.c | 46 ++- src/box/tuple_compare.cc | 584 +++++++++++++++++++++++++++++++++++++-- src/lib/coll/coll.c | 33 +++ src/lib/coll/coll.h | 4 + 5 files changed, 730 insertions(+), 32 deletions(-) diff --git a/src/box/key_def.h b/src/box/key_def.h index dd62f6a35..0d8f3112e 100644 --- a/src/box/key_def.h +++ b/src/box/key_def.h @@ -137,6 +137,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a, typedef int (*tuple_compare_t)(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_def *key_def); +/** @copydoc tuple_compare_with_key_hinted() */ +typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple, + uint64_t tuple_hint, + const char *key, + uint32_t part_count, + uint64_t key_hint, + struct key_def *key_def); +/** @copydoc tuple_compare_hinted() */ +typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a, + uint64_t tuple_a_hint, + const struct tuple *tuple_b, + uint64_t tuple_b_hint, + struct key_def *key_def); /** @copydoc tuple_extract_key() */ typedef char *(*tuple_extract_key_t)(const struct tuple *tuple, struct key_def *key_def, @@ -152,6 +165,11 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple, /** @copydoc key_hash() */ typedef uint32_t (*key_hash_t)(const char *key, struct key_def *key_def); +/** @copydoc tuple_hint() */ +typedef uint64_t (*tuple_hint_t)(const struct tuple *tuple, + struct key_def *key_def); +/** @copydoc key_hint() */ +typedef uint64_t (*key_hint_t)(const char *key, struct key_def *key_def); /* Definition of a multipart key. 
*/ struct key_def { @@ -159,6 +177,10 @@ struct key_def { tuple_compare_t tuple_compare; /** @see tuple_compare_with_key() */ tuple_compare_with_key_t tuple_compare_with_key; + /** @see tuple_compare_with_key_hinted() */ + tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted; + /** @see tuple_compare_hinted() */ + tuple_compare_hinted_t tuple_compare_hinted; /** @see tuple_extract_key() */ tuple_extract_key_t tuple_extract_key; /** @see tuple_extract_key_raw() */ @@ -167,6 +189,10 @@ struct key_def { tuple_hash_t tuple_hash; /** @see key_hash() */ key_hash_t key_hash; + /** @see tuple_hint() */ + tuple_hint_t tuple_hint; + /** @see key_hint() */ + key_hint_t key_hint; /** * Minimal part count which always is unique. For example, * if a secondary index is unique, then @@ -571,6 +597,51 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key, return key_def->tuple_compare_with_key(tuple, key, part_count, key_def); } +/** + * Compare tuples using the key definition and comparison hints. + * @param tuple_a First tuple. + * @param tuple_a_hint Comparison hint is calculated for the + * @a tuple_a. + * @param tuple_b Second tuple. + * @param tuple_b_hint Comparison hint is calculated for the + * @a tuple_b. + * @param key_def Key definition. + * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b) + * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b) + * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b) + */ +static inline int +tuple_compare_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint, + const struct tuple *tuple_b, uint64_t tuple_b_hint, + struct key_def *key_def) +{ + return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b, + tuple_b_hint, key_def); +} + +/** + * Compare tuple with key using the key definition and + * comparison hints. + * @param tuple tuple + * @param tuple_hint Comparison hint is calculated for @a tuple. 
+ * @param key key parts without MessagePack array header + * @param part_count the number of parts in @a key + * @param key_hint t Comparison key kent is calculated for @a key. + * @param key_def key definition + * @retval 0 if key_fields(tuple) == parts(key) + * @retval <0 if key_fields(tuple) < parts(key) + * @retval >0 if key_fields(tuple) > parts(key) + */ +static inline int +tuple_compare_with_key_hinted(const struct tuple *tuple, uint64_t tuple_hint, + const char *key, uint32_t part_count, + uint64_t key_hint, struct key_def *key_def) +{ + return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key, + part_count, key_hint, + key_def); +} + /** * Compute hash of a tuple field. * @param ph1 - pointer to running hash @@ -624,6 +695,30 @@ key_hash(const char *key, struct key_def *key_def) return key_def->key_hash(key, key_def); } + /* + * Get a comparison hint for a @a tuple. + * @param tuple - tuple to get uint64_t of. + * @param key_def - key_def that defines which comparison is used. + * @return the comparison auxiliary information. + */ +static inline uint64_t +tuple_hint(const struct tuple *tuple, struct key_def *key_def) +{ + return key_def->tuple_hint(tuple, key_def); +} + +/** + * Get a comparison hint of a @a key. + * @param key - key to get hint of. + * @param key_def - key_def that defines which comparison is used. + * @return the comparison auxiliary information. + */ +static inline uint64_t +key_hint(const char *key, struct key_def *key_def) +{ + return key_def->key_hint(key, key_def); +} + #if defined(__cplusplus) } /* extern "C" */ #endif /* defined(__cplusplus) */ diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c index d5b42efb6..2c1933ceb 100644 --- a/src/box/memtx_tree.c +++ b/src/box/memtx_tree.c @@ -47,6 +47,11 @@ struct memtx_tree_key_data { const char *key; /** Number of msgpacked search fields. */ uint32_t part_count; + /** + * Comparison hint is calculated with + * key_hint(key, ...). 
+ */ + uint64_t hint; }; /** @@ -55,6 +60,11 @@ struct memtx_tree_key_data { struct memtx_tree_data { /* Tuple that this node represents. */ struct tuple *tuple; + /** + * Comparison hint is calculated with + * tuple_hint(tuple, ...). + */ + uint64_t hint; }; /** @@ -69,16 +79,18 @@ static bool memtx_tree_data_identical(const struct memtx_tree_data *a, const struct memtx_tree_data *b) { - return a->tuple == b->tuple; + return a->tuple == b->tuple && a->hint == b->hint; } #define BPS_TREE_NAME memtx_tree #define BPS_TREE_BLOCK_SIZE (512) #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE #define BPS_TREE_COMPARE(a, b, arg)\ - tuple_compare((&a)->tuple, (&b)->tuple, arg) + tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\ + (&b)->hint, arg) #define BPS_TREE_COMPARE_KEY(a, b, arg)\ - tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg) + tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\ + (b)->part_count, (b)->hint, arg) #define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b) #define bps_tree_elem_t struct memtx_tree_data #define bps_tree_key_t struct memtx_tree_key_data * @@ -113,7 +125,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c) const struct memtx_tree_data *data_a = a; const struct memtx_tree_data *data_b = b; struct key_def *key_def = c; - return tuple_compare(data_a->tuple, data_b->tuple, key_def); + return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple, + data_b->hint, key_def); } /* {{{ MemtxTree Iterators ****************************************/ @@ -235,9 +248,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret) memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator); /* Use user key def to save a few loops.
*/ if (res == NULL || - tuple_compare_with_key(res->tuple, it->key_data.key, - it->key_data.part_count, - it->index_def->key_def) != 0) { + tuple_compare_with_key_hinted(res->tuple, res->hint, + it->key_data.key, + it->key_data.part_count, + it->key_data.hint, + it->index_def->key_def) != 0) { iterator->next = tree_iterator_dummie; it->current.tuple = NULL; *ret = NULL; @@ -266,9 +281,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret) memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator); /* Use user key def to save a few loops. */ if (res == NULL || - tuple_compare_with_key(res->tuple, it->key_data.key, - it->key_data.part_count, - it->index_def->key_def) != 0) { + tuple_compare_with_key_hinted(res->tuple, res->hint, + it->key_data.key, + it->key_data.part_count, + it->key_data.hint, + it->index_def->key_def) != 0) { iterator->next = tree_iterator_dummie; it->current.tuple = NULL; *ret = NULL; @@ -516,9 +533,11 @@ memtx_tree_index_get(struct index *base, const char *key, assert(base->def->opts.is_unique && part_count == base->def->key_def->part_count); struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_index_cmp_def(index); struct memtx_tree_key_data key_data; key_data.key = key; key_data.part_count = part_count; + key_data.hint = key_hint(key, cmp_def); struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data); *result = res != NULL ? 
res->tuple : NULL; return 0; @@ -530,9 +549,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, struct tuple **result) { struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *key_def = index->tree.arg; if (new_tuple) { struct memtx_tree_data new_data; new_data.tuple = new_tuple; + new_data.hint = tuple_hint(new_tuple, key_def); struct memtx_tree_data dup_data; dup_data.tuple = NULL; @@ -565,6 +586,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, if (old_tuple) { struct memtx_tree_data old_data; old_data.tuple = old_tuple; + old_data.hint = tuple_hint(old_tuple, key_def); memtx_tree_delete(&index->tree, old_data); } *result = old_tuple; @@ -607,6 +629,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, it->type = type; it->key_data.key = key; it->key_data.part_count = part_count; + struct key_def *cmp_def = memtx_tree_index_cmp_def(index); + it->key_data.hint = key_hint(key, cmp_def); it->index_def = base->def; it->tree = &index->tree; it->tree_iterator = memtx_tree_invalid_iterator(); @@ -671,6 +695,8 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple) struct memtx_tree_data *elem = &index->build_array[index->build_array_size++]; elem->tuple = tuple; + struct key_def *key_def = index->tree.arg; + elem->hint = tuple_hint(tuple, key_def); return 0; } diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index 1bcff2ca3..86922c9bb 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -34,6 +34,383 @@ #include "trivia/util.h" /* NOINLINE */ #include <math.h> +/** + * Tuple hints + * + * Hint is a value h(t) calculated for a tuple in terms + * of a particular key_def that obeys the following rules: + * if h(t1) < h(t2) then t1 < t2; + * if h(t1) > h(t2) then t1 > t2; + * if h(t1) == h(t2) then regular comparison is required; + * These rules mean that instead of direct tuple vs tuple + * (or tuple vs key) comparison one
may compare their + * hints first; and only if the hints are equal compare + * the tuples themselves. + * + * The comparison hint has the following structure: + * uint64_t: [ CMP_HINT_TYPE | CMP_HINT_VALUE ] + * 64 62 0 bit + * + * The reserved value HINT_INVALID is returned when hint is + * undefined. + */ +#define HINT_MAX (((int64_t)1 << 61) - 1) +#define HINT_MIN (-HINT_MAX - 1) +#define HINT_UMAX (((uint64_t)1 << 62) - 1) + +#define HINT_INVALID UINT64_MAX + +#define HINT_TYPE_NUMBER ((uint8_t)1) +#define HINT_TYPE_STRING ((uint8_t)2) +#define HINT_TYPE_BOOL ((uint8_t)3) + +#define HINT_TYPE(hint) ((uint64_t)hint >> 62) +#define HINT_VALUE(hint) ((uint64_t)hint & (((uint64_t)1 << 62) - 1)) +#define HINT(type, value) ({ \ + assert(HINT_TYPE(value) == 0); \ + (((uint64_t)type << 62) | (uint64_t)value);}) + +/** + * Perform hints comparison. + * Returns 1 when hint_a > hint_b, -1 when hint_a < hint_b. + * If the hints are incompatible or equal, a value of 0 is + * returned, and further comparison of the original data is + * required. + */ +static int +hint_cmp(uint64_t hint_a, uint64_t hint_b) +{ + if (hint_a != HINT_INVALID && hint_b != HINT_INVALID && + HINT_TYPE(hint_a) == HINT_TYPE(hint_b) && + HINT_VALUE(hint_a) != HINT_VALUE(hint_b)) + return hint_a < hint_b ? -1 : 1; + else + return 0; +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_uint(const char *key, struct key_def *key_def) +{ + (void)key_def; + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_UNSIGNED); + uint64_t ret; + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_NUMBER, 0); + assert(mp_typeof(*key) == MP_UINT); + uint64_t val = mp_decode_uint(&key); + ret = val > HINT_MAX ?
HINT_UMAX : val - (uint64_t)HINT_MIN; + return HINT(HINT_TYPE_NUMBER, ret); +} + +template<bool is_nullable> +static uint64_t +tuple_hint_uint(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_UNSIGNED); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_NUMBER, 0); + return key_hint_uint<false, is_nullable>(field, key_def); +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_int(const char *key, struct key_def *key_def) +{ + (void)key_def; + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_INTEGER); + uint64_t ret; + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_NUMBER, 0); + if (mp_typeof(*key) == MP_UINT) { + uint64_t val = mp_decode_uint(&key); + ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN; + } else { + assert(mp_typeof(*key) == MP_INT); + int64_t val = mp_decode_int(&key); + ret = val > HINT_MAX ? HINT_UMAX : + val < HINT_MIN ? 
0 : + (uint64_t)val - (uint64_t)HINT_MIN; + } + return HINT(HINT_TYPE_NUMBER, ret); +} + +template<bool is_nullable> +static uint64_t +tuple_hint_int(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_INTEGER); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_NUMBER, 0); + return key_hint_int<false, is_nullable>(field, key_def); +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_number(const char *key, struct key_def *key_def) +{ + (void)key_def; + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_NUMBER || + key_def->parts->type == FIELD_TYPE_SCALAR); + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_NUMBER, 0); + uint64_t ret; + switch (mp_typeof(*key)) { + case MP_FLOAT: + case MP_DOUBLE: { + double f = mp_typeof(*key) == MP_FLOAT ? + mp_decode_float(&key) : mp_decode_double(&key); + if (!isfinite(f)) + return HINT_INVALID; + double val; + (void)modf(f, &val); + ret = val > (double)HINT_MAX ? HINT_UMAX : + val < (double)HINT_MIN ? 0 : + (uint64_t)val - (uint64_t)HINT_MIN; + break; + } + case MP_INT: { + int64_t val = (uint64_t)mp_decode_int(&key); + ret = val > HINT_MAX ? HINT_UMAX : + val < HINT_MIN ? 0 : + (uint64_t)val - (uint64_t)HINT_MIN; + break; + } + case MP_UINT: { + uint64_t val = mp_decode_uint(&key); + ret = val > HINT_MAX ? 
HINT_UMAX : val - (uint64_t)HINT_MIN; + break; + } + default: + unreachable(); + } + return HINT(HINT_TYPE_NUMBER, ret); +} + +template<bool is_nullable> +static uint64_t +tuple_hint_number(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_NUMBER); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_NUMBER, 0); + return key_hint_number<false, is_nullable>(field, key_def); +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_boolean(const char *key, struct key_def *key_def) +{ + (void)key_def; + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_BOOLEAN || + key_def->parts->type == FIELD_TYPE_SCALAR); + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_BOOL, 0); + return HINT(HINT_TYPE_BOOL, mp_decode_bool(&key)); +} + +template<bool is_nullable> +static uint64_t +tuple_hint_boolean(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_BOOLEAN); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_BOOL, 0); + return key_hint_boolean<false, is_nullable>(field, key_def); +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_string(const char *key, struct key_def *key_def) +{ + (void)key_def; + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->coll == NULL); + assert(key_def->parts->type == FIELD_TYPE_STRING || + key_def->parts->type == FIELD_TYPE_SCALAR); + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_STRING, 0); + assert(mp_typeof(*key) == MP_STR); + uint32_t len; +
const unsigned char *str = + (const unsigned char *)mp_decode_str(&key, &len); + uint64_t result = 0; + uint32_t process_len = MIN(len, 7); + for (uint32_t i = 0; i < process_len; i++) { + result <<= 8; + result |= str[i]; + } + result <<= 8 * (7 - process_len); + return HINT(HINT_TYPE_STRING, result); +} + +template<bool is_nullable> +static uint64_t +tuple_hint_string(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->coll == NULL); + assert(key_def->parts->type == FIELD_TYPE_STRING); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_STRING, 0); + return key_hint_string<false, is_nullable>(field, key_def); +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_string_coll(const char *key, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert((key_def->parts->type == FIELD_TYPE_STRING || + key_def->parts->type == FIELD_TYPE_SCALAR) && + key_def->parts->coll != NULL); + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_STRING, 0); + assert(mp_typeof(*key) == MP_STR); + uint32_t len; + const char *str = mp_decode_str(&key, &len); + return key_def->parts->coll->hint(str, len, key_def->parts->coll); +} + +template<bool is_nullable> +static uint64_t +tuple_hint_string_coll(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_STRING && + key_def->parts->coll != NULL); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_STRING, 0); + return key_hint_string_coll<false, is_nullable>(field, key_def); +} + +template<bool is_optional, bool is_nullable> +static uint64_t +key_hint_scalar(const char *key, struct 
key_def *key_def) +{ + (void)key_def; + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_SCALAR); + if ((is_optional && key == NULL) || + (is_nullable && mp_typeof(*key) == MP_NIL)) + return HINT(HINT_TYPE_BOOL, 0); + switch (mp_typeof(*key)) { + case MP_INT: + case MP_UINT: + case MP_FLOAT: + case MP_DOUBLE: + return key_hint_number<is_optional, is_nullable>(key, key_def); + case MP_BOOL: + return key_hint_boolean<is_optional, is_nullable>(key, key_def); + case MP_STR: + if (key_def->parts->coll == NULL) + return key_hint_string<is_optional, + is_nullable>(key, key_def); + else + return key_hint_string_coll<is_optional, + is_nullable>(key, key_def); + default: + return HINT_INVALID; + } +} + +template<bool is_nullable> +static uint64_t +tuple_hint_scalar(const struct tuple *tuple, struct key_def *key_def) +{ + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(key_def->parts->type == FIELD_TYPE_SCALAR); + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return HINT(HINT_TYPE_BOOL, 0); + return key_hint_scalar<false, is_nullable>(field, key_def); +} + +void +key_def_set_hint_func(struct key_def *def) +{ + def->key_hint = NULL; + def->tuple_hint = NULL; + bool is_nullable = key_part_is_nullable(def->parts); + switch (def->parts->type) { + case FIELD_TYPE_UNSIGNED: { + def->key_hint = is_nullable ? key_hint_uint<true, true> : + key_hint_uint<true, false>; + def->tuple_hint = is_nullable ? tuple_hint_uint<true> : + tuple_hint_uint<false>; + break; + } + case FIELD_TYPE_INTEGER: { + def->key_hint = is_nullable ? key_hint_int<true, true> : + key_hint_int<true, false>; + def->tuple_hint = is_nullable ? tuple_hint_int<true> : + tuple_hint_int<false>; + break; + } + case FIELD_TYPE_STRING: { + if (def->parts->coll != NULL) { + def->key_hint = + is_nullable ? 
key_hint_string_coll<true, true> : + key_hint_string_coll<true, false>; + def->tuple_hint = + is_nullable ? tuple_hint_string_coll<true> : + tuple_hint_string_coll<false>; + } else { + def->key_hint = + is_nullable ? key_hint_string<true, true> : + key_hint_string<true, false>; + def->tuple_hint = is_nullable ? tuple_hint_string<true> : + tuple_hint_string<false>; + } + break; + } + case FIELD_TYPE_NUMBER: { + def->key_hint = is_nullable ? key_hint_number<true, true> : + key_hint_number<true, false>; + def->tuple_hint = is_nullable ? tuple_hint_number<true> : + tuple_hint_number<false>; + break; + } + case FIELD_TYPE_BOOLEAN: { + def->key_hint = is_nullable ? key_hint_boolean<true, true> : + key_hint_boolean<true, false>; + def->tuple_hint = is_nullable ? tuple_hint_boolean<true> : + tuple_hint_boolean<false>; + break; + } + case FIELD_TYPE_SCALAR: { + def->key_hint = is_nullable ? key_hint_scalar<true, true> : + key_hint_scalar<true, false>; + def->tuple_hint = is_nullable ? tuple_hint_scalar<true> : + tuple_hint_scalar<false>; + break; + } + default: { + break; + } + } +} + /* {{{ tuple_compare */ /* @@ -460,13 +837,17 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b, template<bool is_nullable, bool has_optional_parts, bool has_json_paths> static inline int -tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, - struct key_def *key_def) +tuple_compare_slowpath_hinted(const struct tuple *tuple_a, + uint64_t tuple_a_hint, const struct tuple *tuple_b, + uint64_t tuple_b_hint, struct key_def *key_def) { assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + int rc = 0; + if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0) + return rc; struct key_part *part = key_def->parts; const char *tuple_a_raw = tuple_data(tuple_a); const char *tuple_b_raw = 
tuple_data(tuple_b); @@ -502,7 +883,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_part *end; const char *field_a, *field_b; enum mp_type a_type, b_type; - int rc; if (is_nullable) end = part + key_def->unique_part_count; else @@ -592,8 +972,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, template<bool is_nullable, bool has_optional_parts, bool has_json_paths> static inline int -tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, - uint32_t part_count, struct key_def *key_def) +tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, + struct key_def *key_def) +{ + return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts, + has_json_paths>(tuple_a, HINT_INVALID, tuple_b, + HINT_INVALID, key_def); +} + +template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +static inline int +tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple, + uint64_t tuple_hint, const char *key, uint32_t part_count, + uint64_t key_hint, struct key_def *key_def) { assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); @@ -601,6 +992,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, assert(has_optional_parts == key_def->has_optional_parts); assert(key != NULL || part_count == 0); assert(part_count <= key_def->part_count); + int rc = 0; + if ((rc = hint_cmp(tuple_hint, key_hint)) != 0) + return rc; struct key_part *part = key_def->parts; struct tuple_format *format = tuple_format(tuple); const char *tuple_raw = tuple_data(tuple); @@ -636,7 +1030,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, } struct key_part *end = part + part_count; - int rc; for (; part < end; ++part, mp_next(&key)) { const char *field; if (has_json_paths) { @@ -675,6 +1068,17 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const 
char *key, return 0; } +template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +static inline int +tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + return tuple_compare_with_key_slowpath_hinted<is_nullable, + has_optional_parts, has_json_paths>(tuple, + HINT_INVALID, key, part_count, + HINT_INVALID, key_def); +} + template<bool is_nullable> static inline int key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count, @@ -733,13 +1137,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count, template<bool is_nullable, bool has_optional_parts> static inline int -tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, - uint32_t part_count, struct key_def *key_def) +tuple_compare_with_key_sequential_hinted(const struct tuple *tuple, + uint64_t tuple_hint, const char *key, uint32_t part_count, + uint64_t key_hint, struct key_def *key_def) { assert(!has_optional_parts || is_nullable); assert(key_def_is_sequential(key_def)); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; const char *tuple_key = tuple_data(tuple); uint32_t field_count = mp_decode_array(&tuple_key); uint32_t cmp_part_count; @@ -749,8 +1157,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, assert(field_count >= part_count); cmp_part_count = part_count; } - int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count, - key_def); + rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count, + key_def); if (!has_optional_parts || rc != 0) return rc; /* @@ -772,6 +1180,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, return 0; } +template<bool is_nullable, bool has_optional_parts> +static inline int +tuple_compare_with_key_sequential(const 
struct tuple *tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + return tuple_compare_with_key_sequential_hinted<is_nullable, + has_optional_parts>(tuple, HINT_INVALID, key, + part_count, HINT_INVALID, key_def); +} + int key_compare(const char *key_a, const char *key_b, struct key_def *key_def) { @@ -792,9 +1210,13 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def) template <bool is_nullable, bool has_optional_parts> static int -tuple_compare_sequential(const struct tuple *tuple_a, - const struct tuple *tuple_b, key_def *key_def) +tuple_compare_sequential_hinted(const struct tuple *tuple_a, + uint64_t tuple_a_hint, const struct tuple *tuple_b, + uint64_t tuple_b_hint, key_def *key_def) { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; assert(!has_optional_parts || is_nullable); assert(has_optional_parts == key_def->has_optional_parts); assert(key_def_is_sequential(key_def)); @@ -812,7 +1234,6 @@ tuple_compare_sequential(const struct tuple *tuple_a, bool was_null_met = false; struct key_part *part = key_def->parts; struct key_part *end = part + key_def->unique_part_count; - int rc; uint32_t i = 0; for (; part < end; ++part, ++i) { enum mp_type a_type, b_type; @@ -859,6 +1280,16 @@ tuple_compare_sequential(const struct tuple *tuple_a, return 0; } +template <bool is_nullable, bool has_optional_parts> +static int +tuple_compare_sequential(const struct tuple *tuple_a, + const struct tuple *tuple_b, key_def *key_def) +{ + return tuple_compare_sequential_hinted<is_nullable, + has_optional_parts>(tuple_a, HINT_INVALID, tuple_b, + HINT_INVALID, key_def); +} + template <int TYPE> static inline int field_compare(const char **field_a, const char **field_b); @@ -989,6 +1420,18 @@ struct TupleCompare compare(tuple_a, tuple_b, format_a, format_b, field_a, field_b); } + + static int compare_hinted(const struct tuple *tuple_a, + uint64_t tuple_a_hint, + const struct tuple *tuple_b, + uint64_t 
tuple_b_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; + return compare(tuple_a, tuple_b, key_def); + } }; template <int TYPE, int ...MORE_TYPES> @@ -1006,15 +1449,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> { return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b, format_a, format_b, field_a, field_b); } + + static int compare_hinted(const struct tuple *tuple_a, + uint64_t tuple_a_hint, + const struct tuple *tuple_b, + uint64_t tuple_b_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; + return compare(tuple_a, tuple_b, key_def); + } }; } /* end of anonymous namespace */ struct comparator_signature { tuple_compare_t f; + tuple_compare_hinted_t f_hinted; uint32_t p[64]; }; #define COMPARATOR(...) \ - { TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } }, + { TupleCompare<__VA_ARGS__>::compare, \ + TupleCompare<__VA_ARGS__>::compare_hinted, \ + { __VA_ARGS__, UINT32_MAX } }, /** * field1 no, field1 type, field2 no, field2 type, ... 
@@ -1049,6 +1507,17 @@ static const tuple_compare_t compare_slowpath_funcs[] = { tuple_compare_slowpath<true, true, true> }; +static const tuple_compare_hinted_t compare_slowpath_hinted_funcs[] = { + tuple_compare_slowpath_hinted<false, false, false>, + tuple_compare_slowpath_hinted<true, false, false>, + tuple_compare_slowpath_hinted<false, true, false>, + tuple_compare_slowpath_hinted<true, true, false>, + tuple_compare_slowpath_hinted<false, false, true>, + tuple_compare_slowpath_hinted<true, false, true>, + tuple_compare_slowpath_hinted<false, true, true>, + tuple_compare_slowpath_hinted<true, true, true> +}; + void key_def_set_tuple_compare(struct key_def *def) { @@ -1060,13 +1529,21 @@ key_def_set_tuple_compare(struct key_def *def) if (def->has_optional_parts) { def->tuple_compare = tuple_compare_sequential<true, true>; + def->tuple_compare_hinted = + tuple_compare_sequential_hinted<true, + true>; } else { def->tuple_compare = tuple_compare_sequential<true, false>; + def->tuple_compare_hinted = + tuple_compare_sequential_hinted<true, + false>; } } else { def->tuple_compare = compare_slowpath_funcs[cmp_func_idx]; + def->tuple_compare_hinted = + compare_slowpath_hinted_funcs[cmp_func_idx]; } return; } @@ -1085,13 +1562,20 @@ key_def_set_tuple_compare(struct key_def *def) if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) { def->tuple_compare = cmp_arr[k].f; + def->tuple_compare_hinted = cmp_arr[k].f_hinted; return; } } } - def->tuple_compare = key_def_is_sequential(def) ? 
- tuple_compare_sequential<false, false> : - compare_slowpath_funcs[cmp_func_idx]; + if (key_def_is_sequential(def)) { + def->tuple_compare = tuple_compare_sequential<false, false>; + def->tuple_compare_hinted = + tuple_compare_sequential_hinted<false, false>; + } else { + def->tuple_compare = compare_slowpath_funcs[cmp_func_idx]; + def->tuple_compare_hinted = + compare_slowpath_hinted_funcs[cmp_func_idx]; + } } /* }}} tuple_compare */ @@ -1222,6 +1706,17 @@ struct TupleCompareWithKey compare(tuple, key, part_count, key_def, format, field); } + + static int + compare_hinted(const struct tuple *tuple, uint64_t tuple_hint, + const char *key, uint32_t part_count, uint64_t key_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; + return compare(tuple, key, part_count, key_def); + } }; template <int TYPE, int ...MORE_TYPES> @@ -1242,6 +1737,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...> compare(tuple, key, part_count, key_def, format, field); } + + static int + compare_hinted(const struct tuple *tuple, uint64_t tuple_hint, + const char *key, uint32_t part_count, uint64_t key_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; + return compare(tuple, key, part_count, key_def); + } }; } /* end of anonymous namespace */ @@ -1249,11 +1755,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...> struct comparator_with_key_signature { tuple_compare_with_key_t f; + tuple_compare_with_key_hinted_t f_hinted; uint32_t p[64]; }; #define KEY_COMPARATOR(...) 
\ - { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } }, + { TupleCompareWithKey<0, __VA_ARGS__>::compare, \ + TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \ + { __VA_ARGS__ } }, static const comparator_with_key_signature cmp_wk_arr[] = { KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED) @@ -1284,6 +1793,18 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = { tuple_compare_with_key_slowpath<true, true, true> }; +static const tuple_compare_with_key_hinted_t + compare_with_key_slowpath_hinted_funcs[] = { + tuple_compare_with_key_slowpath_hinted<false, false, false>, + tuple_compare_with_key_slowpath_hinted<true, false, false>, + tuple_compare_with_key_slowpath_hinted<false, true, false>, + tuple_compare_with_key_slowpath_hinted<true, true, false>, + tuple_compare_with_key_slowpath_hinted<false, false, true>, + tuple_compare_with_key_slowpath_hinted<true, false, true>, + tuple_compare_with_key_slowpath_hinted<false, true, true>, + tuple_compare_with_key_slowpath_hinted<true, true, true> +}; + void key_def_set_tuple_compare_with_key(struct key_def *def) { @@ -1296,14 +1817,23 @@ key_def_set_tuple_compare_with_key(struct key_def *def) def->tuple_compare_with_key = tuple_compare_with_key_sequential<true, true>; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_sequential_hinted< + true, true>; } else { def->tuple_compare_with_key = tuple_compare_with_key_sequential<true, false>; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_sequential_hinted< + true, false>; } } else { def->tuple_compare_with_key = compare_with_key_slowpath_funcs[cmp_func_idx]; + def->tuple_compare_with_key_hinted = + compare_with_key_slowpath_hinted_funcs[ + cmp_func_idx]; } return; } @@ -1325,14 +1855,23 @@ key_def_set_tuple_compare_with_key(struct key_def *def) } if (i == def->part_count) { def->tuple_compare_with_key = cmp_wk_arr[k].f; + def->tuple_compare_with_key_hinted = + 
cmp_wk_arr[k].f_hinted; return; } } } - def->tuple_compare_with_key = - key_def_is_sequential(def) ? - tuple_compare_with_key_sequential<false, false> : - compare_with_key_slowpath_funcs[cmp_func_idx]; + if (key_def_is_sequential(def)) { + def->tuple_compare_with_key = + tuple_compare_with_key_sequential<false, false>; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_sequential_hinted<false, false>; + } else { + def->tuple_compare_with_key = + compare_with_key_slowpath_funcs[cmp_func_idx]; + def->tuple_compare_with_key_hinted = + compare_with_key_slowpath_hinted_funcs[cmp_func_idx]; + } } /* }}} tuple_compare_with_key */ @@ -1342,4 +1881,5 @@ key_def_set_compare_func(struct key_def *def) { key_def_set_tuple_compare(def); key_def_set_tuple_compare_with_key(def); + key_def_set_hint_func(def); } diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c index afc15e809..3bd7c44f5 100644 --- a/src/lib/coll/coll.c +++ b/src/lib/coll/coll.c @@ -133,6 +133,37 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, return s_len; } +/** Get a compare hint of a binary collation. */ +static uint64_t +coll_bin_hint(const char *s, size_t s_len, struct coll *coll) +{ + (void)coll; + uint64_t result = 0; + uint32_t process_len = MIN(s_len, 7); + for (uint32_t i = 0; i < process_len; i++) { + result <<= 8; + result |= ((unsigned char *)s)[i]; + } + result <<= 8 * (7 - process_len); + return result; +} + +/** Get a compare hint of a string using ICU collation. 
*/ +static uint64_t +coll_icu_hint(const char *s, size_t s_len, struct coll *coll) +{ + assert(coll->type == COLL_TYPE_ICU); + UCharIterator itr; + uiter_setUTF8(&itr, s, s_len); + uint8_t buf[7]; + uint32_t state[2] = {0, 0}; + UErrorCode status = U_ZERO_ERROR; + int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, buf, + sizeof(buf), &status); + assert(got >= 0 && got <= 7); + return coll_bin_hint((const char *)buf, got, NULL); +} + /** * Set up ICU collator and init cmp and hash members of collation. * @param coll Collation to set up. @@ -242,6 +273,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def) } coll->cmp = coll_icu_cmp; coll->hash = coll_icu_hash; + coll->hint = coll_icu_hint; return 0; } @@ -356,6 +388,7 @@ coll_new(const struct coll_def *def) coll->collator = NULL; coll->cmp = coll_bin_cmp; coll->hash = coll_bin_hash; + coll->hint = coll_bin_hint; break; default: unreachable(); diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h index 8c9f94293..d695a02ad 100644 --- a/src/lib/coll/coll.h +++ b/src/lib/coll/coll.h @@ -48,6 +48,8 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t, typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, struct coll *coll); +typedef uint64_t (*coll_hint_f)(const char *s, size_t s_len, struct coll *coll); + struct UCollator; /** @@ -62,6 +64,8 @@ struct coll { /** String comparator. */ coll_cmp_f cmp; coll_hash_f hash; + /** The pointer to routine calculating tuple hint. */ + coll_hint_f hint; /** Reference counter. */ int refs; /** -- 2.21.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v6 2/3] memtx: introduce tuple compare hint 2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov @ 2019-03-14 8:19 ` Vladimir Davydov 2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov 0 siblings, 1 reply; 11+ messages in thread From: Vladimir Davydov @ 2019-03-14 8:19 UTC (permalink / raw) To: Kirill Shcherbatov; +Cc: tarantool-patches On Wed, Mar 13, 2019 at 03:15:38PM +0300, Kirill Shcherbatov wrote: > Implement functions for retrieving tuple hints for a particular > key_def. Hint is an integer that can be used for tuple comparison > optimization: if a hint of one tuple is less than a hint of another > then the first tuple is definitely less than the second; only if > hints are equal tuple_compare must be called for getting comparison > result. > > Hints are calculated using only the first part of key_def. > > Hints are only useful when: > * they are precalculated and stored along with the tuple; > calculation is not cheap (cheaper than tuple_compare btw) but > precalculated hints allow to compare tuples without even fetching > tuple data. > * first part of key_def is 'string'(for now) > * since hint is calculated using only the first part of key_def > (and only first several characters if it is a string) this part > must be different with high probability for every tuple pair. Please refresh the comment. It isn't quite correct. > > Closes #3961 > --- > src/box/key_def.h | 95 +++++++ > src/box/memtx_tree.c | 46 ++- > src/box/tuple_compare.cc | 584 +++++++++++++++++++++++++++++++++++++-- > src/lib/coll/coll.c | 33 +++ > src/lib/coll/coll.h | 4 + > 5 files changed, 730 insertions(+), 32 deletions(-) Please add tests checking that changing an indexed field type from scalar to a basic type and vice versa without rebuilding the index doesn't break ordering. 
> > diff --git a/src/box/key_def.h b/src/box/key_def.h > index dd62f6a35..0d8f3112e 100644 > --- a/src/box/key_def.h > +++ b/src/box/key_def.h > @@ -137,6 +137,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a, > typedef int (*tuple_compare_t)(const struct tuple *tuple_a, > const struct tuple *tuple_b, > struct key_def *key_def); > +/** @copydoc tuple_compare_with_key_hinted() */ > +typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple, > + uint64_t tuple_hint, > + const char *key, > + uint32_t part_count, > + uint64_t key_hint, > + struct key_def *key_def); On second thought, we might want to reduce the hint size or increase it in the future. Would it be better to introduce a helper type for hints so that we could easily do this if we need to? typedef uint64_t tuple_hint_t; What do you think? Would it be worthwhile? Would it look OK? Sorry, I asked you to use uint64_t before. Looks to me now that I was wrong... :-( > @@ -571,6 +597,51 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key, > return key_def->tuple_compare_with_key(tuple, key, part_count, key_def); > } > > +/** > + * Compare tuples using the key definition and comparison hints. > + * @param tuple_a First tuple. > + * @param tuple_a_hint Comparison hint is calculated for the > + * @a tuple_a. Should be "Comparison hint of @a tuple_a" Please fix here and everywhere else where appropriate. > + * @param tuple_b Second tuple. > + * @param tuple_b_hint Comparison hint is calculated for the > + * @a tuple_b. > + * @param key_def Key definition.
> + * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b) > + * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b) > + * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b) > + */ > +static inline int > +tuple_compare_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint, > + const struct tuple *tuple_b, uint64_t tuple_b_hint, > + struct key_def *key_def) > +{ > + return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b, > + tuple_b_hint, key_def); > +} > + > +/** > + * Compare tuple with key using the key definition and > + * comparison hints. > + * @param tuple tuple > + * @param tuple_hint Comparison hint is calculated for @a tuple. > + * @param key key parts without MessagePack array header > + * @param part_count the number of parts in @a key This is nitpicking, of course, but please be consistent. If you put dot (.) at the end of a sentence, put it everywhere. > + * @param key_hint t Comparison key kent is calculated for @a key. Comparison hint of @a key. > + * @param key_def key definition > + * @retval 0 if key_fields(tuple) == parts(key) > + * @retval <0 if key_fields(tuple) < parts(key) > + * @retval >0 if key_fields(tuple) > parts(key) > + */ > +static inline int > +tuple_compare_with_key_hinted(const struct tuple *tuple, uint64_t tuple_hint, > + const char *key, uint32_t part_count, > + uint64_t key_hint, struct key_def *key_def) > +{ > + return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key, > + part_count, key_hint, > + key_def); > +} > + > /** > * Compute hash of a tuple field. > * @param ph1 - pointer to running hash > @@ -624,6 +695,30 @@ key_hash(const char *key, struct key_def *key_def) > return key_def->key_hash(key, key_def); > } > > + /* > + * Get a comparison hint for a @a tuple. The article before 'tuple' ('a') is redundant. > + * @param tuple - tuple to get uint64_t of. tuple to compute the hint for. > + * @param key_def - key_def that defines which comparison is used. 
key definition used for tuple comparison. > + * @return the comparison auxiliary information. the hint. > + */ > +static inline uint64_t > +tuple_hint(const struct tuple *tuple, struct key_def *key_def) > +{ > + return key_def->tuple_hint(tuple, key_def); > +} > + > +/** > + * Get a comparison hint of a @a key. > + * @param key - key to get hint of. > + * @param key_def - key_def that defines which comparison is used. > + * @return the comparison auxiliary information. Ditto. > + */ > +static inline uint64_t > +key_hint(const char *key, struct key_def *key_def) > +{ > + return key_def->key_hint(key, key_def); > +} > + > #if defined(__cplusplus) > } /* extern "C" */ > #endif /* defined(__cplusplus) */ > diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c > index d5b42efb6..2c1933ceb 100644 > --- a/src/box/memtx_tree.c > +++ b/src/box/memtx_tree.c > @@ -47,6 +47,11 @@ struct memtx_tree_key_data { > const char *key; > /** Number of msgpacked search fields. */ > uint32_t part_count; > + /** > + * Comparison hint is calculated with > + * key_hint(key, ...). > + */ /** Comparison hint, see tuple_hint(). */ > + uint64_t hint; > }; > > /** > @@ -55,6 +60,11 @@ struct memtx_tree_key_data { > struct memtx_tree_data { > /* Tuple that this node is represents. */ > struct tuple *tuple; > + /** > + * Comparison hint is calculated with > + * tuple_hint(tuple, ...). > + */ /** Comparison hint, see key_hint(). 
*/ > + uint64_t hint; > }; > > /** > @@ -69,16 +79,18 @@ static bool > memtx_tree_data_identical(const struct memtx_tree_data *a, > const struct memtx_tree_data *b) > { > - return a->tuple == b->tuple; > + return a->tuple == b->tuple && a->hint == b->hint; > } > > #define BPS_TREE_NAME memtx_tree > #define BPS_TREE_BLOCK_SIZE (512) > #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE > #define BPS_TREE_COMPARE(a, b, arg)\ > - tuple_compare((&a)->tuple, (&b)->tuple, arg) > + tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\ > + (&b)->hint, arg) Hmm, (&a)->tuple is equivalent to (a).tuple, isn't it? > @@ -516,9 +533,11 @@ memtx_tree_index_get(struct index *base, const char *key, > assert(base->def->opts.is_unique && > part_count == base->def->key_def->part_count); > struct memtx_tree_index *index = (struct memtx_tree_index *)base; > + struct key_def *cmp_def = memtx_tree_index_cmp_def(index); > struct memtx_tree_key_data key_data; > key_data.key = key; > key_data.part_count = part_count; > + key_data.hint = key_hint(key, cmp_def); > struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data); > *result = res != NULL ? res->tuple : NULL; > return 0; > @@ -530,9 +549,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, > struct tuple **result) > { > struct memtx_tree_index *index = (struct memtx_tree_index *)base; > + struct key_def *key_def = index->tree.arg; Accessing index->tree.arg looks ugly - it violates encapsulation. Besides, right above you use index_def_cmp_def instead. 
I think we better introduce a helper function for accessing index->tree.arg and use it everywhere: memtx_tree_cmp_def(&index->tree) > diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc > index 1bcff2ca3..86922c9bb 100644 > --- a/src/box/tuple_compare.cc > +++ b/src/box/tuple_compare.cc > @@ -34,6 +34,383 @@ > #include "trivia/util.h" /* NOINLINE */ > #include <math.h> > > +/** > + * Tuple hints If you agree to add tuple_hint_t, then please move this comment to the type declaration. Don't forget to update it when you introduce multikey indexes. > + * > + * Hint is a value h(t) is calculated for tuple in terms > + * of particular key_def that has the following rules: Tuple comparison hint h(t) is such a function of tuple t that the following conditions always hold for any pair of tuples t1 and t2: > + * if h(t1) < h(t2) then t1 < t2; > + * if h(t1) > h(t2) then t1 > t2; > + * if t1 == t2 then regular comparision is required; if h(t1) == h(t2) then t1 may or may not be equal to t2. > + * These rules mean that instead of direct tuple vs tuple > + * (or tuple vs key) comparison one may compare theirs > + * hints first; and only if theirs hints equal compare > + * the tuples themselves. > + * > + * The comparison hint has the following structure: > + * uint64_t: [ CMP_HINT_TYPE | CMP_HINT_VALUE ] > + * 64 62 0 bit > + * > + * The reserved value HINT_INVALID is returned when hint is > + * undefined. > + */ > +#define HINT_MAX (((int64_t)1 << 61) - 1) > +#define HINT_MIN (-HINT_MAX - 1) > +#define HINT_UMAX (((uint64_t)1 << 62) - 1) What's HINT_UMAX? Please remove if possible. > + > +#define HINT_INVALID UINT64_MAX HINT_NONE would be a better name IMO, because it would stress on the fact that the hint is absent. 
> + > +#define HINT_TYPE_NUMBER ((uint8_t)1) > +#define HINT_TYPE_STRING ((uint8_t)2) > +#define HINT_TYPE_BOOL ((uint8_t)3) > + > +#define HINT_TYPE(hint) ((uint64_t)hint >> 62) > +#define HINT_VALUE(hint) ((uint64_t)hint & (((uint64_t)1 << 62) - 1)) Using 62 directly is error-prone (in case we decide to change it). Please define constants for all numbers you use. > +#define HINT(type, value) ({ \ > + assert(HINT_TYPE(value) == 0); \ > + (((uint64_t)type << 62) | (uint64_t)value);}) Would be better to make these helpers static inline functions. > + > +/** > + * Perform hints comparison. > + * Returns 1 when hint_a > hint_b, -1 when hint_b > hint_a; > + * If the hints are incompatible or equal, a value of 0 is > + * returned, and the further comparison of the original data is > + * required. > + */ > +static int > +hint_cmp(uint64_t hint_a, uint64_t hint_b) > +{ > + if (hint_a != HINT_INVALID && hint_b != HINT_INVALID && > + HINT_TYPE(hint_a) == HINT_TYPE(hint_b) && > + HINT_VALUE(hint_a) != HINT_VALUE(hint_b)) > + return hint_a < hint_b ? -1 : 1; Hints must be defined in such a way that we don't need to compare types, i.e. hint(123) < hint("abc"). I suggest you take a look at mp_class. Maybe we could reuse it for hint types? > + else > + return 0; > +} > + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_uint(const char *key, struct key_def *key_def) Please use unsigned/integer for these functions so as not to confuse them with MP_INT/UINT. > +{ > + (void)key_def; > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_UNSIGNED); > + uint64_t ret; > + if ((is_optional && key == NULL) || > + (is_nullable && mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_NUMBER, 0); > + assert(mp_typeof(*key) == MP_UINT); > + uint64_t val = mp_decode_uint(&key); > + ret = val > HINT_MAX ? 
HINT_UMAX : val - (uint64_t)HINT_MIN; > + return HINT(HINT_TYPE_NUMBER, ret); > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_uint(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_UNSIGNED); > + const char *field = tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_NUMBER, 0); > + return key_hint_uint<false, is_nullable>(field, key_def); > +} Although I didn't complain about the way you define hint functions before, I've always had an inkling that something is wrong, that calling key_hint from tuple_hint looks weird. The 'is_optional' template argument you added recently has only strengthened my suspicions: it looks confusing, because in fact it doesn't have anything to do with optional parts we use in tuple_compare templates. I propose the following refactoring. Let's introduce field_hint_* functions (not using templates) that would take only binary data of a particular type and return a hint, e.g. tuple_hint_t field_hint_unsigned(const char *field) { assert(mp_typeof(*field) == MP_UINT); ... } Then add two parametrised functions for computing a hint for a key and a tuple that would look like this: template<enum field_type type, ...> tuple_hint(const struct tuple *tuple, struct key_def *key_def) { switch (type) { case FIELD_TYPE_UNSIGNED: return field_hint_unsigned(...); case FIELD_TYPE_INTEGER: return field_hint_integer(...); ... } } template<enum field_type type, ...> key_hint(...) { switch (type) { case FIELD_TYPE_UNSIGNED: return field_hint_unsigned(...); case FIELD_TYPE_INTEGER: return field_hint_integer(...); ... } } This way all the logic computing hints would be consolidated in field_hint_* family of functions which would be simple and wouldn't even take key_def. This would also allow you to get rid of that ugly switch-case in key_def_set_hint_func. 
> + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_int(const char *key, struct key_def *key_def) > +{ > + (void)key_def; > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_INTEGER); > + uint64_t ret; > + if ((is_optional && key == NULL) || > + (is_nullable && mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_NUMBER, 0); > + if (mp_typeof(*key) == MP_UINT) { > + uint64_t val = mp_decode_uint(&key); > + ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN; > + } else { > + assert(mp_typeof(*key) == MP_INT); > + int64_t val = mp_decode_int(&key); > + ret = val > HINT_MAX ? HINT_UMAX : > + val < HINT_MIN ? 0 : > + (uint64_t)val - (uint64_t)HINT_MIN; > + } > + return HINT(HINT_TYPE_NUMBER, ret); > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_int(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_INTEGER); > + const char *field = tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_NUMBER, 0); > + return key_hint_int<false, is_nullable>(field, key_def); > +} > + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_number(const char *key, struct key_def *key_def) > +{ > + (void)key_def; > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_NUMBER || > + key_def->parts->type == FIELD_TYPE_SCALAR); > + if ((is_optional && key == NULL) || > + (is_nullable && mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_NUMBER, 0); > + uint64_t ret; > + switch (mp_typeof(*key)) { > + case MP_FLOAT: > + case MP_DOUBLE: { > + double f = mp_typeof(*key) == MP_FLOAT ? 
> + mp_decode_float(&key) : mp_decode_double(&key); > + if (!isfinite(f)) > + return HINT_INVALID; > + double val; > + (void)modf(f, &val); > + ret = val > (double)HINT_MAX ? HINT_UMAX : > + val < (double)HINT_MIN ? 0 : As I told you before, comparing double with int64_t like this is incorrect: #include <stdio.h> #include <stdint.h> int main() { int64_t x = (1LL << 61) - 1; double y = x; printf("x is %lld (int64_t)\n", x); printf("y is %lf (double)\n", y); printf("y <= (double)x is %s\n", y <= (double)x ? "true" : "false"); printf("(int64_t)y is %lld\n", (int64_t)y); } prints x is 2305843009213693951 (int64_t) y is 2305843009213693952.000000 (double) y <= (double)x is true (int64_t)y is 2305843009213693952 Please fix and don't forget to add a test case. > + (uint64_t)val - (uint64_t)HINT_MIN; > + break; > + } > + case MP_INT: { > + int64_t val = (uint64_t)mp_decode_int(&key); > + ret = val > HINT_MAX ? HINT_UMAX : > + val < HINT_MIN ? 0 : > + (uint64_t)val - (uint64_t)HINT_MIN; > + break; > + } > + case MP_UINT: { > + uint64_t val = mp_decode_uint(&key); > + ret = val > HINT_MAX ? 
HINT_UMAX : val - (uint64_t)HINT_MIN; > + break; > + } > + default: > + unreachable(); > + } > + return HINT(HINT_TYPE_NUMBER, ret); > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_number(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_NUMBER); > + const char *field = tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_NUMBER, 0); > + return key_hint_number<false, is_nullable>(field, key_def); > +} > + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_boolean(const char *key, struct key_def *key_def) > +{ > + (void)key_def; > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_BOOLEAN || > + key_def->parts->type == FIELD_TYPE_SCALAR); > + if ((is_optional && key == NULL) || > + (is_nullable && mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_BOOL, 0); > + return (uint64_t)mp_decode_bool(&key); > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_boolean(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_BOOLEAN); > + const char *field = tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_BOOL, 0); > + return key_hint_boolean<false, is_nullable>(field, key_def); > +} > + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_string(const char *key, struct key_def *key_def) > +{ > + (void)key_def; > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->coll == NULL); > + assert(key_def->parts->type == FIELD_TYPE_STRING || > + key_def->parts->type == FIELD_TYPE_SCALAR); > + if ((is_optional && key == NULL) || > + (is_nullable && 
mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_STRING, 0); > + assert(mp_typeof(*key) == MP_STR); > + uint32_t len; > + const unsigned char *str = > + (const unsigned char *)mp_decode_str(&key, &len); > + uint64_t result = 0; > + uint32_t process_len = MIN(len, 7); > + for (uint32_t i = 0; i < process_len; i++) { > + result <<= 8; > + result |= str[i]; > + } > + result <<= 8 * (7 - process_len); Please define an appropriate constant for 7. > + return HINT(HINT_TYPE_STRING, result); > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_string(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->coll == NULL); > + assert(key_def->parts->type == FIELD_TYPE_STRING); > + const char *field = tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_STRING, 0); > + return key_hint_string<false, is_nullable>(field, key_def); > +} > + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_string_coll(const char *key, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert((key_def->parts->type == FIELD_TYPE_STRING || > + key_def->parts->type == FIELD_TYPE_SCALAR) && > + key_def->parts->coll != NULL); > + if ((is_optional && key == NULL) || > + (is_nullable && mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_STRING, 0); > + assert(mp_typeof(*key) == MP_STR); > + uint32_t len; > + const char *str = mp_decode_str(&key, &len); > + return key_def->parts->coll->hint(str, len, key_def->parts->coll); > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_string_coll(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_STRING && > + key_def->parts->coll != NULL); > + const char *field = 
tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_STRING, 0); > + return key_hint_string_coll<false, is_nullable>(field, key_def); > +} > + > +template<bool is_optional, bool is_nullable> > +static uint64_t > +key_hint_scalar(const char *key, struct key_def *key_def) > +{ > + (void)key_def; > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_SCALAR); > + if ((is_optional && key == NULL) || > + (is_nullable && mp_typeof(*key) == MP_NIL)) > + return HINT(HINT_TYPE_BOOL, 0); > + switch (mp_typeof(*key)) { > + case MP_INT: > + case MP_UINT: > + case MP_FLOAT: > + case MP_DOUBLE: > + return key_hint_number<is_optional, is_nullable>(key, key_def); > + case MP_BOOL: > + return key_hint_boolean<is_optional, is_nullable>(key, key_def); > + case MP_STR: > + if (key_def->parts->coll == NULL) > + return key_hint_string<is_optional, > + is_nullable>(key, key_def); > + else > + return key_hint_string_coll<is_optional, > + is_nullable>(key, key_def); Please make sure this if() is resolved at compile time. Since we've chosen the path of using templates for optimization, we should stick to it till the end. > + default: > + return HINT_INVALID; Should be unreachable(). BTW you forgot MP_BIN. 
> + } > +} > + > +template<bool is_nullable> > +static uint64_t > +tuple_hint_scalar(const struct tuple *tuple, struct key_def *key_def) > +{ > + assert(key_part_is_nullable(key_def->parts) == is_nullable); > + assert(key_def->parts->type == FIELD_TYPE_SCALAR); > + const char *field = tuple_field_by_part(tuple, key_def->parts); > + if (is_nullable && field == NULL) > + return HINT(HINT_TYPE_BOOL, 0); > + return key_hint_scalar<false, is_nullable>(field, key_def); > +} > + > +void > +key_def_set_hint_func(struct key_def *def) > +{ > + def->key_hint = NULL; > + def->tuple_hint = NULL; > + bool is_nullable = key_part_is_nullable(def->parts); > + switch (def->parts->type) { > + case FIELD_TYPE_UNSIGNED: { > + def->key_hint = is_nullable ? key_hint_uint<true, true> : > + key_hint_uint<true, false>; > + def->tuple_hint = is_nullable ? tuple_hint_uint<true> : > + tuple_hint_uint<false>; > + break; > + } > + case FIELD_TYPE_INTEGER: { > + def->key_hint = is_nullable ? key_hint_int<true, true> : > + key_hint_int<true, false>; > + def->tuple_hint = is_nullable ? tuple_hint_int<true> : > + tuple_hint_int<false>; > + break; > + } > + case FIELD_TYPE_STRING: { > + if (def->parts->coll != NULL) { > + def->key_hint = > + is_nullable ? key_hint_string_coll<true, true> : > + key_hint_string_coll<true, false>; > + def->tuple_hint = > + is_nullable ? tuple_hint_string_coll<true> : > + tuple_hint_string_coll<false>; > + } else { > + def->key_hint = > + is_nullable ? key_hint_string<true, true> : > + key_hint_string<true, false>; > + def->tuple_hint = is_nullable ? tuple_hint_string<true> : > + tuple_hint_string<false>; > + } > + break; > + } > + case FIELD_TYPE_NUMBER: { > + def->key_hint = is_nullable ? key_hint_number<true, true> : > + key_hint_number<true, false>; > + def->tuple_hint = is_nullable ? tuple_hint_number<true> : > + tuple_hint_number<false>; > + break; > + } > + case FIELD_TYPE_BOOLEAN: { > + def->key_hint = is_nullable ? 
key_hint_boolean<true, true> : > + key_hint_boolean<true, false>; > + def->tuple_hint = is_nullable ? tuple_hint_boolean<true> : > + tuple_hint_boolean<false>; > + break; > + } > + case FIELD_TYPE_SCALAR: { > + def->key_hint = is_nullable ? key_hint_scalar<true, true> : > + key_hint_scalar<true, false>; > + def->tuple_hint = is_nullable ? tuple_hint_scalar<true> : > + tuple_hint_scalar<false>; > + break; > + } > + default: { > + break; > + } > + } > +} > + > /* {{{ tuple_compare */ > > /* > @@ -792,9 +1210,13 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def) > > template <bool is_nullable, bool has_optional_parts> > static int > -tuple_compare_sequential(const struct tuple *tuple_a, > - const struct tuple *tuple_b, key_def *key_def) > +tuple_compare_sequential_hinted(const struct tuple *tuple_a, > + uint64_t tuple_a_hint, const struct tuple *tuple_b, > + uint64_t tuple_b_hint, key_def *key_def) > { > + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); > + if (rc != 0) > + return rc; This should go after all the assertions below. > assert(!has_optional_parts || is_nullable); > assert(has_optional_parts == key_def->has_optional_parts); > assert(key_def_is_sequential(key_def)); > diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c > index afc15e809..3bd7c44f5 100644 > --- a/src/lib/coll/coll.c > +++ b/src/lib/coll/coll.c > @@ -133,6 +133,37 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, > return s_len; > } > > +/** Get a compare hint of a binary collation. */ > +static uint64_t > +coll_bin_hint(const char *s, size_t s_len, struct coll *coll) > +{ > + (void)coll; > + uint64_t result = 0; > + uint32_t process_len = MIN(s_len, 7); > + for (uint32_t i = 0; i < process_len; i++) { > + result <<= 8; > + result |= ((unsigned char *)s)[i]; > + } > + result <<= 8 * (7 - process_len); I don't like that you copy-n-paste string hint computation logic here and in tuple_compare.cc. 
I think that it should be consolidated in tuple_compare.cc while
coll->hint should return a hint in some raw format. I guess it could be
just a wrapper around ucol_nextSortKeyPart. Maybe we should even pick
another name for it.

> +	return result;
> +}
> +
> +/** Get a compare hint of a string using ICU collation. */
> +static uint64_t
> +coll_icu_hint(const char *s, size_t s_len, struct coll *coll)
> +{
> +	assert(coll->type == COLL_TYPE_ICU);
> +	UCharIterator itr;
> +	uiter_setUTF8(&itr, s, s_len);
> +	uint8_t buf[7];
> +	uint32_t state[2] = {0, 0};
> +	UErrorCode status = U_ZERO_ERROR;
> +	int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, buf,
> +					   sizeof(buf), &status);
> +	assert(got >= 0 && got <= 7);
> +	return coll_bin_hint((const char *)buf, got, NULL);
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread
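The 7-byte prefix packing that the review above wants consolidated can be sketched standalone (the `prefix_hint` name is illustrative, not from the patch): the first seven bytes of a string — or of its collation sort key — are packed big-endian into the low 56 bits, zero-padded on the right, so that unsigned comparison of two packed values agrees with memcmp() on the prefixes. This is also why equal hints (a shared 7-byte prefix) must fall back to a full comparison.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Pack up to the first 7 bytes of a string into the low 56 bits of a
 * hint value, big-endian, zero-padding short strings on the right.
 * Unsigned comparison of the results then matches memcmp() on the
 * 7-byte prefixes.
 */
static uint64_t
prefix_hint(const char *s, uint32_t len)
{
	uint32_t n = len < 7 ? len : 7;
	uint64_t res = 0;
	for (uint32_t i = 0; i < n; i++) {
		res <<= CHAR_BIT;
		res |= (unsigned char)s[i];
	}
	res <<= CHAR_BIT * (7 - n);
	return res;
}
```

Strings differing only past the seventh byte pack to the same value, which is exactly the "hints equal, call the real comparator" case.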
* Re: [tarantool-patches] Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
  2019-03-14 8:19 ` Vladimir Davydov
@ 2019-03-15 10:20 ` Kirill Shcherbatov
  2019-03-20 18:08 ` Vladimir Davydov
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:20 UTC (permalink / raw)
To: tarantool-patches, Vladimir Davydov

> Please refresh the comment. It isn't quite correct.
I've stripped it a bit.

> On the second thought, we might want to reduce the hint size or increase
> it in future. Would it be better to introduce a helper type for hints so
> that we could easily do this if we need to?
>
> typedef uint64_t tuple_hint_t;
>
> What do you think? Would it be worthwhile? Would it look OK?
>
> Sorry, I asked you to use uint64_t before. Looks to me now that
> I was wrong... :-(
The tuple_hint_t name is not really good, I believe. First of all,
tuple_hint() is a routine that calculates such a hint for a tuple.
I prefer a hint_t type, which would still be fine later with the
multikey index introduction.

> Hmm, (&a)->tuple is equivalent to (a).tuple, isn't it?
I don't understand why, but it doesn't work that way.

> Accessing index->tree.arg looks ugly - it violates encapsulation.
> Besides, right above you use index_def_cmp_def instead. I think we
> better introduce a helper function for accessing index->tree.arg
> and use it everywhere:
>
> memtx_tree_cmp_def(&index->tree)
Now I use the memtx_tree_index_cmp_def() routine, which is already
present in the code and works fine in this scenario.

==============================================================

Implement functions for retrieving tuple hints for a particular
key_def. A hint is an integer that can be used to optimize tuple
comparison: if the hint of one tuple is less than the hint of
another, then the first tuple is definitely less than the second;
only if the hints are equal must tuple_compare be called to get
the comparison result. Hints are calculated using only the first
part of key_def.
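The contract described in the commit message — a hint orders tuples definitely only when the hints differ — boils down to a two-step comparator: try the hints, and fall back to the full comparison on a tie. A minimal sketch, with `hint_cmp` mirroring the patch and plain strings standing in for tuples (the `compare_hinted` fallback to `strcmp` is illustrative, not the patch's `tuple_compare`):

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t hint_t;
#define HINT_NONE ((hint_t)UINT64_MAX)

/* A definite answer only when both hints are defined and differ;
 * 0 means "fall back to the full comparison". */
static int
hint_cmp(hint_t hint_a, hint_t hint_b)
{
	if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b)
		return hint_a < hint_b ? -1 : 1;
	return 0;
}

/* Illustrative stand-in for a hinted tuple comparator: hints first,
 * then the (expensive) full comparison. */
static int
compare_hinted(const char *a, hint_t a_hint, const char *b, hint_t b_hint)
{
	int rc = hint_cmp(a_hint, b_hint);
	if (rc != 0)
		return rc;
	return strcmp(a, b);
}
```

The win is that when hints differ — the common case for well-distributed first key parts — the comparison never touches the data at all.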
Close #3961
---
 src/box/key_def.h        | 128 +++++++++++++
 src/box/memtx_tree.c     |  40 +++-
 src/box/tuple_compare.cc | 394 +++++++++++++++++++++++++++++++++++++--
 src/lib/coll/coll.c      |  29 +++
 src/lib/coll/coll.h      |   5 +
 test/engine/ddl.result   | 180 ++++++++++++++++++
 test/engine/ddl.test.lua |  49 +++++
 7 files changed, 800 insertions(+), 25 deletions(-)

diff --git a/src/box/key_def.h b/src/box/key_def.h
index 7c9f8c6b5..174790f72 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -116,6 +116,41 @@ struct key_part {
 struct key_def;
 struct tuple;

+/**
+ * Tuple comparison hint h(t) is such a function of tuple t that
+ * the following conditions always hold for any pair of tuples
+ * t1 and t2:
+ * if h(t1) < h(t2) then t1 < t2;
+ * if h(t1) > h(t2) then t1 > t2;
+ * if h(t1) == h(t2) then t1 may or may not be equal to t2.
+ * These rules mean that instead of direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their
+ * hints first; and only if their hints are equal compare
+ * the tuples themselves.
+ *
+ * The comparison hint has the following structure:
+ * uint64_t: [ HINT_TYPE | HINT_VALUE ]
+ *          64          60            0 bit
+ *
+ * To make the values of hints for numeric types (unsigned,
+ * integer, float, double) compatible, define a linear mapping
+ * of the segment f(x) = k*x + b = x + max + 1:
+ * f([-max - 1, max]) = [0, 2 * max + 1].
+ * It is easy to make sure that a < b <=> f(a) < f(b).
+ * The max value is determined by the number of bits allocated
+ * for storage of HINT_VALUE.
+ */
+typedef uint64_t hint_t;
+
+#define HINT_VALUE_BITS 60
+#define HINT_MAX (((int64_t)1 << (HINT_VALUE_BITS - 1)) - 1)
+#define HINT_MIN (-HINT_MAX - 1)
+
+/**
+ * The reserved value to use when comparison hint is undefined.
+ */
+#define HINT_NONE ((hint_t)UINT64_MAX)
+
 /**
  * Get is_nullable property of key_part.
* @param key_part for which attribute is being fetched @@ -137,6 +172,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a, typedef int (*tuple_compare_t)(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_def *key_def); +/** @copydoc tuple_compare_with_key_hinted() */ +typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple, + hint_t tuple_hint, + const char *key, + uint32_t part_count, + hint_t key_hint, + struct key_def *key_def); +/** @copydoc tuple_compare_hinted() */ +typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a, + hint_t tuple_a_hint, + const struct tuple *tuple_b, + hint_t tuple_b_hint, + struct key_def *key_def); /** @copydoc tuple_extract_key() */ typedef char *(*tuple_extract_key_t)(const struct tuple *tuple, struct key_def *key_def, @@ -152,6 +200,11 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple, /** @copydoc key_hash() */ typedef uint32_t (*key_hash_t)(const char *key, struct key_def *key_def); +/** @copydoc tuple_hint() */ +typedef hint_t (*tuple_hint_t)(const struct tuple *tuple, + struct key_def *key_def); +/** @copydoc key_hint() */ +typedef hint_t (*key_hint_t)(const char *key, struct key_def *key_def); /* Definition of a multipart key. 
*/ struct key_def { @@ -159,6 +212,10 @@ struct key_def { tuple_compare_t tuple_compare; /** @see tuple_compare_with_key() */ tuple_compare_with_key_t tuple_compare_with_key; + /** @see tuple_compare_with_key_hinted() */ + tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted; + /** @see tuple_compare_hinted() */ + tuple_compare_hinted_t tuple_compare_hinted; /** @see tuple_extract_key() */ tuple_extract_key_t tuple_extract_key; /** @see tuple_extract_key_raw() */ @@ -167,6 +224,10 @@ struct key_def { tuple_hash_t tuple_hash; /** @see key_hash() */ key_hash_t key_hash; + /** @see tuple_hint() */ + tuple_hint_t tuple_hint; + /** @see key_hint() */ + key_hint_t key_hint; /** * Minimal part count which always is unique. For example, * if a secondary index is unique, then @@ -575,6 +636,49 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key, return key_def->tuple_compare_with_key(tuple, key, part_count, key_def); } +/** + * Compare tuples using the key definition and comparison hints. + * @param tuple_a First tuple. + * @param tuple_a_hint Comparison hint of @a tuple_a. + * @param tuple_b Second tuple. + * @param tuple_b_hint Comparison hint of @a tuple_b. + * @param key_def Key definition. + * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b). + * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b). + * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b). + */ +static inline int +tuple_compare_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint, + const struct tuple *tuple_b, hint_t tuple_b_hint, + struct key_def *key_def) +{ + return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b, + tuple_b_hint, key_def); +} + +/** + * Compare tuple with key using the key definition and + * comparison hints. + * @param tuple Tuple. + * @param tuple_hint Comparison hint of @a tuple. + * @param key Key parts without MessagePack array header. + * @param part_count The number of parts in @a key. 
+ * @param key_hint t Comparison hint of @a key. + * @param key_def Key definition. + * @retval 0 if key_fields(tuple) == parts(key). + * @retval <0 if key_fields(tuple) < parts(key). + * @retval >0 if key_fields(tuple) > parts(key). + */ +static inline int +tuple_compare_with_key_hinted(const struct tuple *tuple, hint_t tuple_hint, + const char *key, uint32_t part_count, + hint_t key_hint, struct key_def *key_def) +{ + return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key, + part_count, key_hint, + key_def); +} + /** * Compute hash of a tuple field. * @param ph1 - pointer to running hash @@ -628,6 +732,30 @@ key_hash(const char *key, struct key_def *key_def) return key_def->key_hash(key, key_def); } + /* + * Get comparison hint for a @a tuple. + * @param tuple Tuple to compute the hint for. + * @param key_def Key definition used for tuple comparison. + * @return The hint. + */ +static inline hint_t +tuple_hint(const struct tuple *tuple, struct key_def *key_def) +{ + return key_def->tuple_hint(tuple, key_def); +} + +/** + * Get a comparison hint of a @a key. + * @param key Key to compute the hint for. + * @param key_def Key definition used for tuple comparison. + * @return The hint. + */ +static inline hint_t +key_hint(const char *key, struct key_def *key_def) +{ + return key_def->key_hint(key, key_def); +} + #if defined(__cplusplus) } /* extern "C" */ #endif /* defined(__cplusplus) */ diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c index d5b42efb6..c2c8a19e7 100644 --- a/src/box/memtx_tree.c +++ b/src/box/memtx_tree.c @@ -47,6 +47,8 @@ struct memtx_tree_key_data { const char *key; /** Number of msgpacked search fields. */ uint32_t part_count; + /** Comparison hint, see tuple_hint(). */ + hint_t hint; }; /** @@ -55,6 +57,8 @@ struct memtx_tree_key_data { struct memtx_tree_data { /* Tuple that this node is represents. */ struct tuple *tuple; + /** Comparison hint, see key_hint(). 
*/ + hint_t hint; }; /** @@ -69,16 +73,18 @@ static bool memtx_tree_data_identical(const struct memtx_tree_data *a, const struct memtx_tree_data *b) { - return a->tuple == b->tuple; + return a->tuple == b->tuple && a->hint == b->hint; } #define BPS_TREE_NAME memtx_tree #define BPS_TREE_BLOCK_SIZE (512) #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE #define BPS_TREE_COMPARE(a, b, arg)\ - tuple_compare((&a)->tuple, (&b)->tuple, arg) + tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\ + (&b)->hint, arg) #define BPS_TREE_COMPARE_KEY(a, b, arg)\ - tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg) + tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\ + (b)->part_count, (b)->hint, arg) #define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b) #define bps_tree_elem_t struct memtx_tree_data #define bps_tree_key_t struct memtx_tree_key_data * @@ -113,7 +119,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c) const struct memtx_tree_data *data_a = a; const struct memtx_tree_data *data_b = b; struct key_def *key_def = c; - return tuple_compare(data_a->tuple, data_b->tuple, key_def); + return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple, + data_b->hint, key_def); } /* {{{ MemtxTree Iterators ****************************************/ @@ -235,9 +242,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret) memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator); /* Use user key def to save a few loops. 
*/ if (res == NULL || - tuple_compare_with_key(res->tuple, it->key_data.key, - it->key_data.part_count, - it->index_def->key_def) != 0) { + tuple_compare_with_key_hinted(res->tuple, res->hint, + it->key_data.key, + it->key_data.part_count, + it->key_data.hint, + it->index_def->key_def) != 0) { iterator->next = tree_iterator_dummie; it->current.tuple = NULL; *ret = NULL; @@ -266,9 +275,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret) memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator); /* Use user key def to save a few loops. */ if (res == NULL || - tuple_compare_with_key(res->tuple, it->key_data.key, - it->key_data.part_count, - it->index_def->key_def) != 0) { + tuple_compare_with_key_hinted(res->tuple, res->hint, + it->key_data.key, + it->key_data.part_count, + it->key_data.hint, + it->index_def->key_def) != 0) { iterator->next = tree_iterator_dummie; it->current.tuple = NULL; *ret = NULL; @@ -516,9 +527,11 @@ memtx_tree_index_get(struct index *base, const char *key, assert(base->def->opts.is_unique && part_count == base->def->key_def->part_count); struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_index_cmp_def(index); struct memtx_tree_key_data key_data; key_data.key = key; key_data.part_count = part_count; + key_data.hint = key_hint(key, cmp_def); struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data); *result = res != NULL ? 
res->tuple : NULL; return 0; @@ -530,9 +543,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, struct tuple **result) { struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *key_def = memtx_tree_index_cmp_def(index); if (new_tuple) { struct memtx_tree_data new_data; new_data.tuple = new_tuple; + new_data.hint = tuple_hint(new_tuple, key_def); struct memtx_tree_data dup_data; dup_data.tuple = NULL; @@ -565,6 +580,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, if (old_tuple) { struct memtx_tree_data old_data; old_data.tuple = old_tuple; + old_data.hint = tuple_hint(old_tuple, key_def); memtx_tree_delete(&index->tree, old_data); } *result = old_tuple; @@ -607,6 +623,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, it->type = type; it->key_data.key = key; it->key_data.part_count = part_count; + struct key_def *cmp_def = memtx_tree_index_cmp_def(index); + it->key_data.hint = key_hint(key, cmp_def); it->index_def = base->def; it->tree = &index->tree; it->tree_iterator = memtx_tree_invalid_iterator(); @@ -671,6 +689,8 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple) struct memtx_tree_data *elem = &index->build_array[index->build_array_size++]; elem->tuple = tuple; + struct key_def *key_def = memtx_tree_index_cmp_def(index); + elem->hint = tuple_hint(tuple, key_def); return 0; } diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index 7fe1766a8..54a977926 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -34,6 +34,14 @@ #include "trivia/util.h" /* NOINLINE */ #include <math.h> +/** + * Perform hints comparison according hint_t definition. + * Returns 1 when hint_a > hint_b, -1 when hint_b > hint_a; + * When tuple_hints are incompatible or equal, return 0. 
+ */ +static int +hint_cmp(hint_t hint_a, hint_t hint_b); + /* {{{ tuple_compare */ /* @@ -427,13 +435,17 @@ tuple_compare_field_with_hint(const char *field_a, enum mp_type a_type, template<bool is_nullable, bool has_optional_parts, bool has_json_paths> static inline int -tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, - struct key_def *key_def) +tuple_compare_slowpath_hinted(const struct tuple *tuple_a, + hint_t tuple_a_hint, const struct tuple *tuple_b, + hint_t tuple_b_hint, struct key_def *key_def) { assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + int rc = 0; + if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0) + return rc; struct key_part *part = key_def->parts; const char *tuple_a_raw = tuple_data(tuple_a); const char *tuple_b_raw = tuple_data(tuple_b); @@ -469,7 +481,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_part *end; const char *field_a, *field_b; enum mp_type a_type, b_type; - int rc; if (is_nullable) end = part + key_def->unique_part_count; else @@ -559,8 +570,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, template<bool is_nullable, bool has_optional_parts, bool has_json_paths> static inline int -tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, - uint32_t part_count, struct key_def *key_def) +tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, + struct key_def *key_def) +{ + return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts, + has_json_paths>(tuple_a, HINT_NONE, tuple_b, HINT_NONE, + key_def); +} + +template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +static inline int +tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple, + hint_t tuple_hint, const char *key, uint32_t 
part_count, + hint_t key_hint, struct key_def *key_def) { assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); @@ -568,6 +590,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, assert(has_optional_parts == key_def->has_optional_parts); assert(key != NULL || part_count == 0); assert(part_count <= key_def->part_count); + int rc = 0; + if ((rc = hint_cmp(tuple_hint, key_hint)) != 0) + return rc; struct key_part *part = key_def->parts; struct tuple_format *format = tuple_format(tuple); const char *tuple_raw = tuple_data(tuple); @@ -603,7 +628,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, } struct key_part *end = part + part_count; - int rc; for (; part < end; ++part, mp_next(&key)) { const char *field; if (has_json_paths) { @@ -642,6 +666,17 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, return 0; } +template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +static inline int +tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + return tuple_compare_with_key_slowpath_hinted<is_nullable, + has_optional_parts, has_json_paths>(tuple, + HINT_NONE, key, part_count, + HINT_NONE, key_def); +} + template<bool is_nullable> static inline int key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count, @@ -700,13 +735,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count, template<bool is_nullable, bool has_optional_parts> static inline int -tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, - uint32_t part_count, struct key_def *key_def) +tuple_compare_with_key_sequential_hinted(const struct tuple *tuple, + hint_t tuple_hint, const char *key, uint32_t part_count, + hint_t key_hint, struct key_def *key_def) { assert(!has_optional_parts || is_nullable); 
assert(key_def_is_sequential(key_def)); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; const char *tuple_key = tuple_data(tuple); uint32_t field_count = mp_decode_array(&tuple_key); uint32_t cmp_part_count; @@ -716,8 +755,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, assert(field_count >= part_count); cmp_part_count = part_count; } - int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count, - key_def); + rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count, + key_def); if (!has_optional_parts || rc != 0) return rc; /* @@ -739,6 +778,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, return 0; } +template<bool is_nullable, bool has_optional_parts> +static inline int +tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + return tuple_compare_with_key_sequential_hinted<is_nullable, + has_optional_parts>(tuple, HINT_NONE, key, + part_count, HINT_NONE, key_def); +} + int key_compare(const char *key_a, const char *key_b, struct key_def *key_def) { @@ -759,13 +808,17 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def) template <bool is_nullable, bool has_optional_parts> static int -tuple_compare_sequential(const struct tuple *tuple_a, - const struct tuple *tuple_b, key_def *key_def) +tuple_compare_sequential_hinted(const struct tuple *tuple_a, + hint_t tuple_a_hint, const struct tuple *tuple_b, + hint_t tuple_b_hint, key_def *key_def) { assert(!has_optional_parts || is_nullable); assert(has_optional_parts == key_def->has_optional_parts); assert(key_def_is_sequential(key_def)); assert(is_nullable == key_def->is_nullable); + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; const char *key_a = tuple_data(tuple_a); uint32_t 
fc_a = mp_decode_array(&key_a); const char *key_b = tuple_data(tuple_b); @@ -779,7 +832,6 @@ tuple_compare_sequential(const struct tuple *tuple_a, bool was_null_met = false; struct key_part *part = key_def->parts; struct key_part *end = part + key_def->unique_part_count; - int rc; uint32_t i = 0; for (; part < end; ++part, ++i) { enum mp_type a_type, b_type; @@ -826,6 +878,16 @@ tuple_compare_sequential(const struct tuple *tuple_a, return 0; } +template <bool is_nullable, bool has_optional_parts> +static int +tuple_compare_sequential(const struct tuple *tuple_a, + const struct tuple *tuple_b, key_def *key_def) +{ + return tuple_compare_sequential_hinted<is_nullable, + has_optional_parts>(tuple_a, HINT_NONE, tuple_b, + HINT_NONE, key_def); +} + template <int TYPE> static inline int field_compare(const char **field_a, const char **field_b); @@ -956,6 +1018,18 @@ struct TupleCompare compare(tuple_a, tuple_b, format_a, format_b, field_a, field_b); } + + static int compare_hinted(const struct tuple *tuple_a, + hint_t tuple_a_hint, + const struct tuple *tuple_b, + hint_t tuple_b_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; + return compare(tuple_a, tuple_b, key_def); + } }; template <int TYPE, int ...MORE_TYPES> @@ -973,15 +1047,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> { return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b, format_a, format_b, field_a, field_b); } + + static int compare_hinted(const struct tuple *tuple_a, + hint_t tuple_a_hint, + const struct tuple *tuple_b, + hint_t tuple_b_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; + return compare(tuple_a, tuple_b, key_def); + } }; } /* end of anonymous namespace */ struct comparator_signature { tuple_compare_t f; + tuple_compare_hinted_t f_hinted; uint32_t p[64]; }; #define COMPARATOR(...) 
\ - { TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } }, + { TupleCompare<__VA_ARGS__>::compare, \ + TupleCompare<__VA_ARGS__>::compare_hinted, \ + { __VA_ARGS__, UINT32_MAX } }, /** * field1 no, field1 type, field2 no, field2 type, ... @@ -1133,6 +1222,17 @@ struct TupleCompareWithKey compare(tuple, key, part_count, key_def, format, field); } + + static int + compare_hinted(const struct tuple *tuple, hint_t tuple_hint, + const char *key, uint32_t part_count, hint_t key_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; + return compare(tuple, key, part_count, key_def); + } }; template <int TYPE, int ...MORE_TYPES> @@ -1153,6 +1253,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...> compare(tuple, key, part_count, key_def, format, field); } + + static int + compare_hinted(const struct tuple *tuple, hint_t tuple_hint, + const char *key, uint32_t part_count, hint_t key_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; + return compare(tuple, key, part_count, key_def); + } }; } /* end of anonymous namespace */ @@ -1160,11 +1271,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...> struct comparator_with_key_signature { tuple_compare_with_key_t f; + tuple_compare_with_key_hinted_t f_hinted; uint32_t p[64]; }; #define KEY_COMPARATOR(...) \ - { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } }, + { TupleCompareWithKey<0, __VA_ARGS__>::compare, \ + TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \ + { __VA_ARGS__ } }, static const comparator_with_key_signature precompiled_cmp_wk_arr[] = { KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED) @@ -1187,16 +1301,22 @@ static const comparator_with_key_signature precompiled_cmp_wk_arr[] = { #define COMPARE_SLOWPATH(...) 
\ {tuple_compare_slowpath<__VA_ARGS__>, \ tuple_compare_with_key_slowpath<__VA_ARGS__>, \ + tuple_compare_slowpath_hinted<__VA_ARGS__>, \ + tuple_compare_with_key_slowpath_hinted<__VA_ARGS__>, \ __VA_ARGS__, false} #define COMPARE_SEQUENTIAL(...) \ {tuple_compare_sequential<__VA_ARGS__>, \ tuple_compare_with_key_sequential<__VA_ARGS__>, \ + tuple_compare_sequential_hinted<__VA_ARGS__>, \ + tuple_compare_with_key_sequential_hinted<__VA_ARGS__>, \ __VA_ARGS__, false, true} static struct { tuple_compare_t tuple_compare; tuple_compare_with_key_t tuple_compare_with_key; + tuple_compare_hinted_t tuple_compare_hinted; + tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted; bool is_nullable; bool has_optional_parts; bool has_json_paths; @@ -1221,11 +1341,236 @@ static struct { /* }}} tuple_compare_with_key */ +static inline hint_t +tuple_hint_new(enum mp_class type, uint64_t val) +{ + assert(((uint64_t)type >> (sizeof(hint_t) * CHAR_BIT - HINT_VALUE_BITS)) == 0); + assert(val >> HINT_VALUE_BITS == 0); + return (hint_t)(((uint64_t)type << HINT_VALUE_BITS) | val); +} + +static int +hint_cmp(hint_t hint_a, hint_t hint_b) +{ + if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b) + return hint_a < hint_b ? 
-1 : 1; + else + return 0; +} + +static hint_t +field_hint_unsigned(const char *field) +{ + assert(mp_typeof(*field) == MP_UINT); + uint64_t val = mp_decode_uint(&field); + if (val > HINT_MAX) + val = HINT_MAX; + uint64_t ret = val - (uint64_t)HINT_MIN; + return tuple_hint_new(MP_CLASS_NUMBER, ret); +} + +static hint_t +field_hint_integer(const char *field) +{ + switch (mp_typeof(*field)) { + case MP_INT: { + int64_t val = mp_decode_int(&field); + if (val > HINT_MAX) + val = HINT_MAX; + else if (val < HINT_MIN) + val = HINT_MIN; + uint64_t ret = (uint64_t)val - (uint64_t)HINT_MIN; + return tuple_hint_new(MP_CLASS_NUMBER, ret); + } + case MP_UINT: + return field_hint_unsigned(field); + default: + unreachable(); + } + return HINT_NONE; +} + +static hint_t +field_hint_number(const char *field) +{ + enum mp_type type = mp_typeof(*field); + switch (type) { + case MP_FLOAT: + case MP_DOUBLE: { + double f = type == MP_FLOAT ? + mp_decode_float(&field) : mp_decode_double(&field); + if (!isfinite(f)) + return HINT_NONE; + uint64_t ret; + if (f > exp2(HINT_VALUE_BITS - 1)) + ret = (uint64_t)HINT_MAX - (uint64_t)HINT_MIN; + else if (f < -exp2(HINT_VALUE_BITS - 1)) + ret = 0; + else { + double val; + (void)modf(f, &val); + ret = (uint64_t)val - (uint64_t)HINT_MIN; + } + return tuple_hint_new(MP_CLASS_NUMBER, ret); + } + case MP_UINT: + return field_hint_unsigned(field); + case MP_INT: + return field_hint_integer(field); + default: + unreachable(); + } + return HINT_NONE; +} + +static hint_t +field_hint_boolean(const char *field) +{ + assert(mp_typeof(*field) == MP_BOOL); + return tuple_hint_new(MP_CLASS_BOOL, mp_decode_bool(&field) ? 
1 : 0); +} + +static uint64_t +memory_buffer_hint_val(const unsigned char *raw, uint32_t size) +{ + int process_len = MIN((int)size, 7); + uint64_t result = 0; + for (int i = 0; i < process_len; i++) { + result <<= CHAR_BIT; + result |= raw[i]; + } + result <<= CHAR_BIT * (7 - process_len); + return result; +} + +template <bool has_collation> +static hint_t +field_hint_string(const char *field, struct coll *coll) +{ + assert(has_collation == (coll != NULL)); + assert(mp_typeof(*field) == MP_STR); + char buff[7]; + uint32_t len; + const char *str = mp_decode_str(&field, &len); + if (has_collation) { + len = coll->get_sort_key(str, len, buff, sizeof(buff), coll); + str = buff; + } + uint64_t ret = memory_buffer_hint_val((const unsigned char *)str, len); + return tuple_hint_new(MP_CLASS_STR, ret); +} + +template <bool has_collation> +static hint_t +field_hint_scalar(const char *field, struct coll *coll) +{ + switch (mp_classof(mp_typeof(*field))) { + case MP_CLASS_NUMBER: + return field_hint_number(field); + case MP_CLASS_BOOL: + return field_hint_boolean(field); + case MP_CLASS_STR: + return field_hint_string<has_collation>(field, coll); + case MP_CLASS_BIN: { + assert(coll == NULL); + uint32_t size; + const unsigned char *raw = + (const unsigned char *)mp_decode_bin(&field, &size); + uint64_t ret = memory_buffer_hint_val(raw, size); + return tuple_hint_new(MP_CLASS_BIN, ret); + } + default: + unreachable(); + } + return HINT_NONE; +} + +template <enum field_type type, bool is_nullable, bool has_collation> +static hint_t +field_hint(const char *field, struct key_def *key_def) +{ + assert(type == key_def->parts->type); + assert(key_part_is_nullable(key_def->parts) == is_nullable); + assert(has_collation == (key_def->parts->coll != NULL)); + switch (type) { + case FIELD_TYPE_UNSIGNED: + return field_hint_unsigned(field); + case FIELD_TYPE_INTEGER: + return field_hint_integer(field); + case FIELD_TYPE_NUMBER: + return field_hint_number(field); + case FIELD_TYPE_BOOLEAN: 
+ return field_hint_boolean(field); + case FIELD_TYPE_STRING: + return field_hint_string<has_collation>(field, + key_def->parts->coll); + case FIELD_TYPE_SCALAR: + return field_hint_scalar<has_collation>(field, + key_def->parts->coll); + default: + unreachable(); + } + return HINT_NONE; +} + +template <enum field_type type, bool is_nullable, bool has_collation> +static hint_t +key_hint(const char *key, struct key_def *key_def) +{ + if (key == NULL || (is_nullable && mp_typeof(*key) == MP_NIL)) + return tuple_hint_new(MP_CLASS_NIL, 0); + return field_hint<type, is_nullable, has_collation>(key, key_def); +} + +template <enum field_type type, bool is_nullable, bool has_collation> +static hint_t +tuple_hint(const struct tuple *tuple, struct key_def *key_def) +{ + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && (field == NULL || mp_typeof(*field) == MP_NIL)) + return tuple_hint_new(MP_CLASS_NIL, 0); + return field_hint<type, is_nullable, has_collation>(field, key_def); +} + +#define HINT(...) 
{ \ + tuple_hint<__VA_ARGS__>, key_hint<__VA_ARGS__>, __VA_ARGS__ } + +static const struct { + tuple_hint_t tuple_hint; + key_hint_t key_hint; + enum field_type type; + bool is_nullable; + bool has_collation; +} hint_arr[] = { + HINT(FIELD_TYPE_UNSIGNED, false, false), + HINT(FIELD_TYPE_STRING, false, false), + HINT(FIELD_TYPE_NUMBER, false, false), + HINT(FIELD_TYPE_INTEGER, false, false), + HINT(FIELD_TYPE_BOOLEAN, false, false), + HINT(FIELD_TYPE_SCALAR, false, false), + HINT(FIELD_TYPE_UNSIGNED, true, false), + HINT(FIELD_TYPE_STRING, true, false), + HINT(FIELD_TYPE_NUMBER, true, false), + HINT(FIELD_TYPE_INTEGER, true, false), + HINT(FIELD_TYPE_BOOLEAN, true, false), + HINT(FIELD_TYPE_SCALAR, true, false), + HINT(FIELD_TYPE_STRING, false, true), + HINT(FIELD_TYPE_SCALAR, false, true), + HINT(FIELD_TYPE_STRING, true, true), + HINT(FIELD_TYPE_SCALAR, true, true) +}; + +#undef HINT + void key_def_set_compare_func(struct key_def *def) { def->tuple_compare = NULL; def->tuple_compare_with_key = NULL; + def->tuple_compare_hinted = NULL; + def->tuple_compare_with_key_hinted = NULL; + def->key_hint = NULL; + def->tuple_hint = NULL; if (!def->is_nullable && !key_def_has_collation(def) && !def->has_json_paths) { /* Precalculated comparators don't use collation */ @@ -1242,6 +1587,8 @@ key_def_set_compare_func(struct key_def *def) if (i == def->part_count && precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) { def->tuple_compare = precompiled_cmp_arr[k].f; + def->tuple_compare_hinted = + precompiled_cmp_arr[k].f_hinted; break; } } @@ -1258,6 +1605,8 @@ key_def_set_compare_func(struct key_def *def) if (i == def->part_count) { def->tuple_compare_with_key = precompiled_cmp_wk_arr[k].f; + def->tuple_compare_with_key_hinted = + precompiled_cmp_wk_arr[k].f_hinted; break; } } @@ -1274,11 +1623,16 @@ key_def_set_compare_func(struct key_def *def) if (def->tuple_compare == NULL) { def->tuple_compare = cmp_arr[k].tuple_compare; + def->tuple_compare_hinted = + 
cmp_arr[k].tuple_compare_hinted; } if (def->tuple_compare_with_key == NULL) { def->tuple_compare_with_key = cmp_arr[k]. tuple_compare_with_key; + def->tuple_compare_with_key_hinted = + cmp_arr[k]. + tuple_compare_with_key_hinted; } break; } @@ -1286,4 +1640,14 @@ key_def_set_compare_func(struct key_def *def) } assert(def->tuple_compare != NULL && def->tuple_compare_with_key != NULL); + for (uint32_t k = 0; k < sizeof(hint_arr) / sizeof(hint_arr[0]); k++) { + if (def->parts->type == hint_arr[k].type && + key_part_is_nullable(def->parts) == + hint_arr[k].is_nullable && + (def->parts->coll != NULL) == hint_arr[k].has_collation) { + def->key_hint = hint_arr[k].key_hint; + def->tuple_hint = hint_arr[k].tuple_hint; + break; + } + } } diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c index afc15e809..d110b156b 100644 --- a/src/lib/coll/coll.c +++ b/src/lib/coll/coll.c @@ -133,6 +133,33 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, return s_len; } +int +coll_icu_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len, + struct coll *coll) +{ + assert(coll->type == COLL_TYPE_ICU); + UCharIterator itr; + uiter_setUTF8(&itr, s, s_len); + uint32_t state[2] = {0, 0}; + UErrorCode status = U_ZERO_ERROR; + int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, + (uint8_t *)buff, buff_len, &status); + assert(got >= 0 && (size_t)got <= buff_len); + assert(U_SUCCESS(status)); + return got; +} + +int +coll_bin_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len, + struct coll *coll) +{ + (void)coll; + assert(coll->type == COLL_TYPE_BINARY); + int to_copy = (int)MIN(s_len, buff_len); + memcpy(buff, s, to_copy); + return to_copy; +} + /** * Set up ICU collator and init cmp and hash members of collation. * @param coll Collation to set up. 
@@ -242,6 +269,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def) } coll->cmp = coll_icu_cmp; coll->hash = coll_icu_hash; + coll->get_sort_key = coll_icu_get_sort_key; return 0; } @@ -356,6 +384,7 @@ coll_new(const struct coll_def *def) coll->collator = NULL; coll->cmp = coll_bin_cmp; coll->hash = coll_bin_hash; + coll->get_sort_key = coll_bin_get_sort_key; break; default: unreachable(); diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h index 8c9f94293..a68668ec5 100644 --- a/src/lib/coll/coll.h +++ b/src/lib/coll/coll.h @@ -48,6 +48,9 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t, typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, struct coll *coll); +typedef int (*coll_get_sort_key_f)(const char *s, size_t s_len, char *buff, + size_t buff_len, struct coll *coll); + struct UCollator; /** @@ -62,6 +65,8 @@ struct coll { /** String comparator. */ coll_cmp_f cmp; coll_hash_f hash; + /** Fill memory buffer with data. */ + coll_get_sort_key_f get_sort_key; /** Reference counter. */ int refs; /** diff --git a/test/engine/ddl.result b/test/engine/ddl.result index 8d34d5ef4..1791138c6 100644 --- a/test/engine/ddl.result +++ b/test/engine/ddl.result @@ -1935,6 +1935,186 @@ s:drop() --- ... -- +-- gh-3961: Introduce tuple comparison hints +-- +s = box.schema.space.create('test', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +i1 = s:create_index('i1', {unique = true, parts = {2, 'scalar'}}) +--- +... +bdbl = (2.1)^62; +--- +... +nbdbl = -bdbl +--- +... +_ = s:insert{0, bdbl} +--- +... +_ = s:insert{1, nbdbl} +--- +... +_ = s:insert{2, -2.3} +--- +... +_ = s:insert{3, -2} +--- +... +_ = s:insert{4, -1.3} +--- +... +_ = s:insert{5, -1} +--- +... +_ = s:insert{6, 0} +--- +... +_ = s:insert{7, 1} +--- +... +_ = s:insert{8, 1.3} +--- +... +_ = s:insert{9, 2} +--- +... +_ = s:insert{10, 2.3} +--- +... 
+i1:select() +--- +- - [1, -9.4972150816945e+19] + - [2, -2.3] + - [3, -2] + - [4, -1.3] + - [5, -1] + - [6, 0] + - [7, 1] + - [8, 1.3] + - [9, 2] + - [10, 2.3] + - [0, 9.4972150816945e+19] +... +i1:alter{parts = {2, 'number'}} +--- +... +_ = s:insert{11, -1.2} +--- +... +_ = s:insert{12, 1.2} +--- +... +_ = s:insert{13, -5} +--- +... +_ = s:insert{14, 5} +--- +... +i1:select() +--- +- - [1, -9.4972150816945e+19] + - [13, -5] + - [2, -2.3] + - [3, -2] + - [4, -1.3] + - [11, -1.2] + - [5, -1] + - [6, 0] + - [7, 1] + - [12, 1.2] + - [8, 1.3] + - [9, 2] + - [10, 2.3] + - [14, 5] + - [0, 9.4972150816945e+19] +... +_ = i1:delete({-2.3}) +--- +... +_ = i1:delete({-1.3}) +--- +... +_ = i1:delete({1.3}) +--- +... +_ = i1:delete({2.3}) +--- +... +_ = i1:delete({-1.2}) +--- +... +_ = i1:delete({1.2}) +--- +... +_ = i1:delete({bdbl}) +--- +... +_ = i1:delete({nbdbl}) +--- +... +i1:alter{parts = {2, 'integer'}} +--- +... +_ = s:insert{15, -4} +--- +... +_ = s:insert{16, 4} +--- +... +i1:select() +--- +- - [13, -5] + - [15, -4] + - [3, -2] + - [5, -1] + - [6, 0] + - [7, 1] + - [9, 2] + - [16, 4] + - [14, 5] +... +_ = i1:delete({-5}) +--- +... +_ = i1:delete({-4}) +--- +... +_ = i1:delete({-2}) +--- +... +_ = i1:delete({-1}) +--- +... +i1:alter{parts = {2, 'unsigned'}} +--- +... +_ = s:insert({17, 10}) +--- +... +i1:alter{parts = {2, 'scalar'}} +--- +... +_ = s:insert({18, 1.1}) +--- +... +i1:select() +--- +- - [6, 0] + - [7, 1] + - [18, 1.1] + - [9, 2] + - [16, 4] + - [14, 5] + - [17, 10] +... +s:drop() +--- +... +-- -- Creating/altering a secondary index of a non-empty space. 
-- s = box.schema.space.create('test', {engine = engine}) diff --git a/test/engine/ddl.test.lua b/test/engine/ddl.test.lua index cdaf7a5bf..a73075498 100644 --- a/test/engine/ddl.test.lua +++ b/test/engine/ddl.test.lua @@ -710,6 +710,55 @@ _ = s:create_index('sk', {parts = {2, 'unsigned'}}) c:get() s:drop() +-- +-- gh-3961: Introduce tuple comparison hints +-- +s = box.schema.space.create('test', {engine = engine}) +pk = s:create_index('pk') +i1 = s:create_index('i1', {unique = true, parts = {2, 'scalar'}}) +bdbl = (2.1)^62; +nbdbl = -bdbl +_ = s:insert{0, bdbl} +_ = s:insert{1, nbdbl} +_ = s:insert{2, -2.3} +_ = s:insert{3, -2} +_ = s:insert{4, -1.3} +_ = s:insert{5, -1} +_ = s:insert{6, 0} +_ = s:insert{7, 1} +_ = s:insert{8, 1.3} +_ = s:insert{9, 2} +_ = s:insert{10, 2.3} +i1:select() +i1:alter{parts = {2, 'number'}} +_ = s:insert{11, -1.2} +_ = s:insert{12, 1.2} +_ = s:insert{13, -5} +_ = s:insert{14, 5} +i1:select() +_ = i1:delete({-2.3}) +_ = i1:delete({-1.3}) +_ = i1:delete({1.3}) +_ = i1:delete({2.3}) +_ = i1:delete({-1.2}) +_ = i1:delete({1.2}) +_ = i1:delete({bdbl}) +_ = i1:delete({nbdbl}) +i1:alter{parts = {2, 'integer'}} +_ = s:insert{15, -4} +_ = s:insert{16, 4} +i1:select() +_ = i1:delete({-5}) +_ = i1:delete({-4}) +_ = i1:delete({-2}) +_ = i1:delete({-1}) +i1:alter{parts = {2, 'unsigned'}} +_ = s:insert({17, 10}) +i1:alter{parts = {2, 'scalar'}} +_ = s:insert({18, 1.1}) +i1:select() +s:drop() + -- -- Creating/altering a secondary index of a non-empty space. -- -- 2.21.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tarantool-patches] Re: [PATCH v6 2/3] memtx: introduce tuple compare hint 2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov @ 2019-03-20 18:08 ` Vladimir Davydov 0 siblings, 0 replies; 11+ messages in thread From: Vladimir Davydov @ 2019-03-20 18:08 UTC (permalink / raw) To: Kirill Shcherbatov; +Cc: tarantool-patches Pushed to master with some changes. You will find the new patch below. I removed the tests because I find them insufficient. Please cover all possible test cases and submit a separate patch. --- From 9fba29abb4e05babb9b23b4413bd8083f0fba933 Mon Sep 17 00:00:00 2001 Message-Id: <9fba29abb4e05babb9b23b4413bd8083f0fba933.1553105230.git.vdavydov.dev@gmail.com> From: Kirill Shcherbatov <kshcherbatov@tarantool.org> Date: Wed, 13 Mar 2019 13:06:30 +0300 Subject: [PATCH] memtx: introduce tuple compare hint Implement functions for retrieving tuple hints for a particular key_def. A hint is an integer that can be used for tuple comparison optimization: if a hint of one tuple is less than a hint of another, then the first tuple is definitely less than the second; only if the hints are equal must tuple_compare be called to get the comparison result. Hints are calculated using only the first part of key_def. @locker: - Rework key_def_set_hint_func. - Refactor functions calculating hints. - Drop has_collation template argument (we don't use templates for collations anywhere else). - Add part_count argument to key_hint (it's conventional to pass part_count along with a decoded key). - Improve comments, rename a few functions, and clean up code.
Close #3961 --- src/box/key_def.h | 115 ++++++++ src/box/memtx_tree.c | 40 +++- src/box/tuple_compare.cc | 535 +++++++++++++++++++++++++++++++++++++++++++++-- src/lib/coll/coll.c | 26 +++ src/lib/coll/coll.h | 12 ++ 5 files changed, 702 insertions(+), 26 deletions(-) diff --git a/src/box/key_def.h b/src/box/key_def.h index 7c9f8c6b..288cf727 100644 --- a/src/box/key_def.h +++ b/src/box/key_def.h @@ -117,6 +117,27 @@ struct key_def; struct tuple; /** + * Tuple comparison hint h(t) is such a function of tuple t that + * the following conditions always hold for any pair of tuples + * t1 and t2: + * + * if h(t1) < h(t2) then t1 < t2; + * if h(t1) > h(t2) then t1 > t2; + * if h(t1) == h(t2) then t1 may or may not be equal to t2. + * + * These rules mean that instead of direct tuple vs tuple + * (or tuple vs key) comparison one may compare their hints + * first and compare the tuples themselves only if their + * hints are equal. + */ +typedef uint64_t hint_t; + +/** + * Reserved value to use when comparison hint is undefined. + */ +#define HINT_NONE ((hint_t)UINT64_MAX) + +/** + * Get is_nullable property of key_part.
* @param key_part for which attribute is being fetched * @@ -133,10 +154,23 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a, const char *key, uint32_t part_count, struct key_def *key_def); +/** @copydoc tuple_compare_with_key_hinted() */ +typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple, + hint_t tuple_hint, + const char *key, + uint32_t part_count, + hint_t key_hint, + struct key_def *key_def); /** @copydoc tuple_compare() */ typedef int (*tuple_compare_t)(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_def *key_def); +/** @copydoc tuple_compare_hinted() */ +typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a, + hint_t tuple_a_hint, + const struct tuple *tuple_b, + hint_t tuple_b_hint, + struct key_def *key_def); /** @copydoc tuple_extract_key() */ typedef char *(*tuple_extract_key_t)(const struct tuple *tuple, struct key_def *key_def, @@ -152,13 +186,23 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple, /** @copydoc key_hash() */ typedef uint32_t (*key_hash_t)(const char *key, struct key_def *key_def); +/** @copydoc tuple_hint() */ +typedef hint_t (*tuple_hint_t)(const struct tuple *tuple, + struct key_def *key_def); +/** @copydoc key_hint() */ +typedef hint_t (*key_hint_t)(const char *key, uint32_t part_count, + struct key_def *key_def); /* Definition of a multipart key. 
*/ struct key_def { /** @see tuple_compare() */ tuple_compare_t tuple_compare; + /** @see tuple_compare_hinted() */ + tuple_compare_hinted_t tuple_compare_hinted; /** @see tuple_compare_with_key() */ tuple_compare_with_key_t tuple_compare_with_key; + /** @see tuple_compare_with_key_hinted() */ + tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted; /** @see tuple_extract_key() */ tuple_extract_key_t tuple_extract_key; /** @see tuple_extract_key_raw() */ @@ -167,6 +211,10 @@ struct key_def { tuple_hash_t tuple_hash; /** @see key_hash() */ key_hash_t key_hash; + /** @see tuple_hint() */ + tuple_hint_t tuple_hint; + /** @see key_hint() */ + key_hint_t key_hint; /** * Minimal part count which always is unique. For example, * if a secondary index is unique, then @@ -558,6 +606,26 @@ tuple_compare(const struct tuple *tuple_a, const struct tuple *tuple_b, } /** + * Compare tuples using the key definition and comparison hints. + * @param tuple_a first tuple + * @param tuple_a_hint comparison hint of @a tuple_a + * @param tuple_b second tuple + * @param tuple_b_hint comparison hint of @a tuple_b + * @param key_def key definition + * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b) + * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b) + * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b) + */ +static inline int +tuple_compare_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint, + const struct tuple *tuple_b, hint_t tuple_b_hint, + struct key_def *key_def) +{ + return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b, + tuple_b_hint, key_def); +} + +/** * @brief Compare tuple with key using the key definition. 
* @param tuple tuple * @param key key parts without MessagePack array header @@ -576,6 +644,29 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key, } /** + * Compare tuple with key using the key definition and + * comparison hints + * @param tuple tuple + * @param tuple_hint comparison hint of @a tuple + * @param key key parts without MessagePack array header + * @param part_count the number of parts in @a key + * @param key_hint comparison hint of @a key + * @param key_def key definition + * @retval 0 if key_fields(tuple) == parts(key) + * @retval <0 if key_fields(tuple) < parts(key) + * @retval >0 if key_fields(tuple) > parts(key) + */ +static inline int +tuple_compare_with_key_hinted(const struct tuple *tuple, hint_t tuple_hint, + const char *key, uint32_t part_count, + hint_t key_hint, struct key_def *key_def) +{ + return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key, + part_count, key_hint, + key_def); +} + +/** * Compute hash of a tuple field. * @param ph1 - pointer to running hash * @param pcarry - pointer to carry @@ -628,6 +719,30 @@ key_hash(const char *key, struct key_def *key_def) return key_def->key_hash(key, key_def); } + /* + * Get comparison hint for a tuple. + * @param tuple - tuple to compute the hint for + * @param key_def - key_def used for tuple comparison + * @return - hint value + */ +static inline hint_t +tuple_hint(const struct tuple *tuple, struct key_def *key_def) +{ + return key_def->tuple_hint(tuple, key_def); +} + +/** + * Get a comparison hint of a key. 
+ * @param key - key to compute the hint for + * @param key_def - key_def used for tuple comparison + * @return - hint value + */ +static inline hint_t +key_hint(const char *key, uint32_t part_count, struct key_def *key_def) +{ + return key_def->key_hint(key, part_count, key_def); +} + #if defined(__cplusplus) } /* extern "C" */ #endif /* defined(__cplusplus) */ diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c index e24c4032..fe037c54 100644 --- a/src/box/memtx_tree.c +++ b/src/box/memtx_tree.c @@ -47,6 +47,8 @@ struct memtx_tree_key_data { const char *key; /** Number of msgpacked search fields. */ uint32_t part_count; + /** Comparison hint, see tuple_hint(). */ + hint_t hint; }; /** @@ -55,6 +57,8 @@ struct memtx_tree_key_data { struct memtx_tree_data { /* Tuple that this node is represents. */ struct tuple *tuple; + /** Comparison hint, see key_hint(). */ + hint_t hint; }; /** @@ -69,16 +73,18 @@ static bool memtx_tree_data_identical(const struct memtx_tree_data *a, const struct memtx_tree_data *b) { - return a->tuple == b->tuple; + return a->tuple == b->tuple && a->hint == b->hint; } #define BPS_TREE_NAME memtx_tree #define BPS_TREE_BLOCK_SIZE (512) #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE #define BPS_TREE_COMPARE(a, b, arg)\ - tuple_compare((&a)->tuple, (&b)->tuple, arg) + tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\ + (&b)->hint, arg) #define BPS_TREE_COMPARE_KEY(a, b, arg)\ - tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg) + tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\ + (b)->part_count, (b)->hint, arg) #define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b) #define bps_tree_elem_t struct memtx_tree_data #define bps_tree_key_t struct memtx_tree_key_data * @@ -119,7 +125,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c) const struct memtx_tree_data *data_a = a; const struct memtx_tree_data *data_b = b; struct key_def *key_def = c; - return 
tuple_compare(data_a->tuple, data_b->tuple, key_def); + return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple, + data_b->hint, key_def); } /* {{{ MemtxTree Iterators ****************************************/ @@ -241,9 +248,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret) memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator); /* Use user key def to save a few loops. */ if (res == NULL || - tuple_compare_with_key(res->tuple, it->key_data.key, - it->key_data.part_count, - it->index_def->key_def) != 0) { + tuple_compare_with_key_hinted(res->tuple, res->hint, + it->key_data.key, + it->key_data.part_count, + it->key_data.hint, + it->index_def->key_def) != 0) { iterator->next = tree_iterator_dummie; it->current.tuple = NULL; *ret = NULL; @@ -272,9 +281,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret) memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator); /* Use user key def to save a few loops. */ if (res == NULL || - tuple_compare_with_key(res->tuple, it->key_data.key, - it->key_data.part_count, - it->index_def->key_def) != 0) { + tuple_compare_with_key_hinted(res->tuple, res->hint, + it->key_data.key, + it->key_data.part_count, + it->key_data.hint, + it->index_def->key_def) != 0) { iterator->next = tree_iterator_dummie; it->current.tuple = NULL; *ret = NULL; @@ -513,9 +524,11 @@ memtx_tree_index_get(struct index *base, const char *key, assert(base->def->opts.is_unique && part_count == base->def->key_def->part_count); struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); struct memtx_tree_key_data key_data; key_data.key = key; key_data.part_count = part_count; + key_data.hint = key_hint(key, part_count, cmp_def); struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data); *result = res != NULL ? 
res->tuple : NULL; return 0; @@ -527,9 +540,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, struct tuple **result) { struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); if (new_tuple) { struct memtx_tree_data new_data; new_data.tuple = new_tuple; + new_data.hint = tuple_hint(new_tuple, cmp_def); struct memtx_tree_data dup_data; dup_data.tuple = NULL; @@ -562,6 +577,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, if (old_tuple) { struct memtx_tree_data old_data; old_data.tuple = old_tuple; + old_data.hint = tuple_hint(old_tuple, cmp_def); memtx_tree_delete(&index->tree, old_data); } *result = old_tuple; @@ -574,6 +590,7 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, { struct memtx_tree_index *index = (struct memtx_tree_index *)base; struct memtx_engine *memtx = (struct memtx_engine *)base->engine; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); assert(part_count == 0 || key != NULL); if (type > ITER_GT) { @@ -604,6 +621,7 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, it->type = type; it->key_data.key = key; it->key_data.part_count = part_count; + it->key_data.hint = key_hint(key, part_count, cmp_def); it->index_def = base->def; it->tree = &index->tree; it->tree_iterator = memtx_tree_invalid_iterator(); @@ -641,6 +659,7 @@ static int memtx_tree_index_build_next(struct index *base, struct tuple *tuple) { struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); if (index->build_array == NULL) { index->build_array = malloc(MEMTX_EXTENT_SIZE); if (index->build_array == NULL) { @@ -668,6 +687,7 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple) struct memtx_tree_data *elem = &index->build_array[index->build_array_size++]; elem->tuple = tuple; + elem->hint = 
tuple_hint(tuple, cmp_def); return 0; } diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index 595f6f25..93756365 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -36,6 +36,25 @@ /* {{{ tuple_compare */ +/** + * Compare two tuple hints. + * + * Returns: + * + * -1 if the first tuple is less than the second tuple + * +1 if the first tuple is greater than the second tuple + * 0 if the first tuple may be less than, equal to, or + * greater than the second tuple, and a full tuple + * comparison is needed to determine the order. + */ +static inline int +hint_cmp(hint_t hint_a, hint_t hint_b) +{ + if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b) + return hint_a < hint_b ? -1 : 1; + return 0; +} + /* * Compare two tuple fields. * Separate version exists since compare is a very @@ -53,7 +72,8 @@ enum mp_class { MP_CLASS_STR, MP_CLASS_BIN, MP_CLASS_ARRAY, - MP_CLASS_MAP + MP_CLASS_MAP, + mp_class_max, }; static enum mp_class mp_classes[] = { @@ -427,13 +447,17 @@ tuple_compare_field_with_type(const char *field_a, enum mp_type a_type, template<bool is_nullable, bool has_optional_parts, bool has_json_paths> static inline int -tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, - struct key_def *key_def) +tuple_compare_slowpath_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint, + const struct tuple *tuple_b, hint_t tuple_b_hint, + struct key_def *key_def) { assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; struct key_part *part = key_def->parts; const char *tuple_a_raw = tuple_data(tuple_a); const char *tuple_b_raw = tuple_data(tuple_b); @@ -469,7 +493,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_part *end; const 
char *field_a, *field_b; enum mp_type a_type, b_type; - int rc; if (is_nullable) end = part + key_def->unique_part_count; else @@ -559,8 +582,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, template<bool is_nullable, bool has_optional_parts, bool has_json_paths> static inline int -tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, - uint32_t part_count, struct key_def *key_def) +tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, + struct key_def *key_def) +{ + return tuple_compare_slowpath_hinted + <is_nullable, has_optional_parts, has_json_paths> + (tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def); +} + +template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +static inline int +tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple, + hint_t tuple_hint, const char *key, uint32_t part_count, + hint_t key_hint, struct key_def *key_def) { assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); @@ -568,6 +602,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, assert(has_optional_parts == key_def->has_optional_parts); assert(key != NULL || part_count == 0); assert(part_count <= key_def->part_count); + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; struct key_part *part = key_def->parts; struct tuple_format *format = tuple_format(tuple); const char *tuple_raw = tuple_data(tuple); @@ -603,7 +640,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, } struct key_part *end = part + part_count; - int rc; for (; part < end; ++part, mp_next(&key)) { const char *field; if (has_json_paths) { @@ -642,6 +678,16 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, return 0; } +template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +static inline int +tuple_compare_with_key_slowpath(const struct tuple 
*tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + return tuple_compare_with_key_slowpath_hinted + <is_nullable, has_optional_parts, has_json_paths> + (tuple, HINT_NONE, key, part_count, HINT_NONE, key_def); +} + template<bool is_nullable> static inline int key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count, @@ -700,13 +746,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count, template<bool is_nullable, bool has_optional_parts> static inline int -tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, - uint32_t part_count, struct key_def *key_def) +tuple_compare_with_key_sequential_hinted(const struct tuple *tuple, + hint_t tuple_hint, const char *key, uint32_t part_count, + hint_t key_hint, struct key_def *key_def) { assert(!has_optional_parts || is_nullable); assert(key_def_is_sequential(key_def)); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; const char *tuple_key = tuple_data(tuple); uint32_t field_count = mp_decode_array(&tuple_key); uint32_t cmp_part_count; @@ -716,8 +766,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, assert(field_count >= part_count); cmp_part_count = part_count; } - int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count, - key_def); + rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count, + key_def); if (!has_optional_parts || rc != 0) return rc; /* @@ -739,6 +789,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, return 0; } +template<bool is_nullable, bool has_optional_parts> +static inline int +tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + return tuple_compare_with_key_sequential_hinted + <is_nullable, has_optional_parts> + 
(tuple, HINT_NONE, key, part_count, HINT_NONE, key_def); +} + int key_compare(const char *key_a, const char *key_b, struct key_def *key_def) { @@ -759,13 +819,17 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def) template <bool is_nullable, bool has_optional_parts> static int -tuple_compare_sequential(const struct tuple *tuple_a, - const struct tuple *tuple_b, key_def *key_def) +tuple_compare_sequential_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint, + const struct tuple *tuple_b, hint_t tuple_b_hint, + struct key_def *key_def) { assert(!has_optional_parts || is_nullable); assert(has_optional_parts == key_def->has_optional_parts); assert(key_def_is_sequential(key_def)); assert(is_nullable == key_def->is_nullable); + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; const char *key_a = tuple_data(tuple_a); uint32_t fc_a = mp_decode_array(&key_a); const char *key_b = tuple_data(tuple_b); @@ -779,7 +843,6 @@ tuple_compare_sequential(const struct tuple *tuple_a, bool was_null_met = false; struct key_part *part = key_def->parts; struct key_part *end = part + key_def->unique_part_count; - int rc; uint32_t i = 0; for (; part < end; ++part, ++i) { enum mp_type a_type, b_type; @@ -826,6 +889,16 @@ tuple_compare_sequential(const struct tuple *tuple_a, return 0; } +template <bool is_nullable, bool has_optional_parts> +static int +tuple_compare_sequential(const struct tuple *tuple_a, + const struct tuple *tuple_b, struct key_def *key_def) +{ + return tuple_compare_sequential_hinted + <is_nullable, has_optional_parts> + (tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def); +} + template <int TYPE> static inline int field_compare(const char **field_a, const char **field_b); @@ -956,6 +1029,18 @@ struct TupleCompare compare(tuple_a, tuple_b, format_a, format_b, field_a, field_b); } + + static int compare_hinted(const struct tuple *tuple_a, + hint_t tuple_a_hint, + const struct tuple *tuple_b, + hint_t tuple_b_hint, + 
struct key_def *key_def) + { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; + return compare(tuple_a, tuple_b, key_def); + } }; template <int TYPE, int ...MORE_TYPES> @@ -973,15 +1058,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> { return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b, format_a, format_b, field_a, field_b); } + + static int compare_hinted(const struct tuple *tuple_a, + hint_t tuple_a_hint, + const struct tuple *tuple_b, + hint_t tuple_b_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_a_hint, tuple_b_hint); + if (rc != 0) + return rc; + return compare(tuple_a, tuple_b, key_def); + } }; } /* end of anonymous namespace */ struct comparator_signature { tuple_compare_t f; + tuple_compare_hinted_t f_hinted; uint32_t p[64]; }; #define COMPARATOR(...) \ - { TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } }, + { TupleCompare<__VA_ARGS__>::compare, \ + TupleCompare<__VA_ARGS__>::compare_hinted, \ + { __VA_ARGS__, UINT32_MAX } }, /** * field1 no, field1 type, field2 no, field2 type, ... 
@@ -1133,6 +1233,17 @@ struct TupleCompareWithKey compare(tuple, key, part_count, key_def, format, field); } + + static int + compare_hinted(const struct tuple *tuple, hint_t tuple_hint, + const char *key, uint32_t part_count, hint_t key_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; + return compare(tuple, key, part_count, key_def); + } }; template <int TYPE, int ...MORE_TYPES> @@ -1153,6 +1264,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...> compare(tuple, key, part_count, key_def, format, field); } + + static int + compare_hinted(const struct tuple *tuple, hint_t tuple_hint, + const char *key, uint32_t part_count, hint_t key_hint, + struct key_def *key_def) + { + int rc = hint_cmp(tuple_hint, key_hint); + if (rc != 0) + return rc; + return compare(tuple, key, part_count, key_def); + } }; } /* end of anonymous namespace */ @@ -1160,11 +1282,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...> struct comparator_with_key_signature { tuple_compare_with_key_t f; + tuple_compare_with_key_hinted_t f_hinted; uint32_t p[64]; }; #define KEY_COMPARATOR(...) \ - { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } }, + { TupleCompareWithKey<0, __VA_ARGS__>::compare, \ + TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \ + { __VA_ARGS__ } }, static const comparator_with_key_signature cmp_wk_arr[] = { KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED) @@ -1186,6 +1311,356 @@ static const comparator_with_key_signature cmp_wk_arr[] = { /* }}} tuple_compare_with_key */ +/* {{{ tuple_hint */ + +/** + * A comparison hint is an unsigned integer number that has + * the following layout: + * + * [ class | value ] + * <-- HINT_CLASS_BITS --> <-- HINT_VALUE_BITS --> + * <----------------- HINT_BITS -----------------> + * + * For simplicity we construct it using the first key part only; + * other key parts don't participate in hint construction. 
As a + * consequence, tuple hints are useless if the first key part + * doesn't differ among indexed tuples. + * + * Hint class stores one of mp_class enum values corresponding + * to the field type. We store it in upper bits of a hint so + * as to guarantee that tuple fields that have different types + * are compared correctly in a scalar index. We use the same + * layout for indexes of base types too so that we don't need + * to recalculate hints when altering an index from scalar to + * one of base types or vice versa without index rebuild. + * + * Hint value is stored in lower bits of a hint and chosen so + * as to satisfy the hint property (see the comment to hint_t). + * Also, hint values must be compatible among all kinds of numbers + * (integer, unsigned, floating point) so that we can alter an + * index between any two of them without touching the hints. + * + * The value depends on the field type: + * + * - For an integer field, the hint value equals the number + * stored in the field minus the minimal (negative) integer + * number that fits in a hint. If the number is too large + * or too small to fit in a hint, we use the max or the min + * number that fits. This guarantees correct order for all + * positive and negative integers. + * + * - For a field storing a floating point number, we use + * the hint computed for the integral part. This ensures + * compatibility between floating point and integer field + * hints. + * + * - For a boolean field, the hint value is simply 0 or 1. + * + * - For a string field, we take the first few characters that + * fit in a hint and shift them in such a way that comparing + * them is equivalent to strcmp(). If there's a collation, + * we use the sort key instead of the string (see coll->hint). + * + * - For a field containing NULL, the value is 0, and we rely on + * mp_class comparison rules for arranging nullable fields. 
+ */ +#define HINT_BITS (sizeof(hint_t) * CHAR_BIT) +#define HINT_CLASS_BITS 4 +#define HINT_VALUE_BITS (HINT_BITS - HINT_CLASS_BITS) + +/** Number of bytes that fit in a hint value. */ +#define HINT_VALUE_BYTES (HINT_VALUE_BITS / CHAR_BIT) + +/** Max unsigned integer that can be stored in a hint value. */ +#define HINT_VALUE_MAX ((1ULL << HINT_VALUE_BITS) - 1) + +/** + * Max and min signed integer numbers that fit in a hint value. + * For numbers > MAX and < MIN we store MAX and MIN, respectively. + */ +#define HINT_VALUE_INT_MAX ((1LL << (HINT_VALUE_BITS - 1)) - 1) +#define HINT_VALUE_INT_MIN (-(1LL << (HINT_VALUE_BITS - 1))) + +/** + * Max and min floating point numbers whose integral parts fit + * in a hint value. Note, we can't compare a floating point number + * with HINT_VALUE_INT_{MIN,MAX} because of rounding errors. + */ +#define HINT_VALUE_DOUBLE_MAX (exp2(HINT_VALUE_BITS - 1) - 1) +#define HINT_VALUE_DOUBLE_MIN (-exp2(HINT_VALUE_BITS - 1)) + +/* + * HINT_CLASS_BITS should be big enough to store any mp_class value. + * Note, ((1 << HINT_CLASS_BITS) - 1) is reserved for HINT_NONE. + */ +static_assert(mp_class_max < (1 << HINT_CLASS_BITS) - 1, + "mp_class must fit in tuple hint"); + +static inline hint_t +hint_create(enum mp_class c, uint64_t val) +{ + assert((val >> HINT_VALUE_BITS) == 0); + return (hint_t)(((uint64_t)c << HINT_VALUE_BITS) | val); +} + +static inline hint_t +hint_nil(void) +{ + return hint_create(MP_CLASS_NIL, 0); +} + +static inline hint_t +hint_bool(bool b) +{ + return hint_create(MP_CLASS_BOOL, b ? 1 : 0); +} + +static inline hint_t +hint_uint(uint64_t u) +{ + uint64_t val = (u >= (uint64_t)HINT_VALUE_INT_MAX ? + HINT_VALUE_MAX : u - HINT_VALUE_INT_MIN); + return hint_create(MP_CLASS_NUMBER, val); +} + +static inline hint_t +hint_int(int64_t i) +{ + assert(i < 0); + uint64_t val = (i <= HINT_VALUE_INT_MIN ? 
0 : i - HINT_VALUE_INT_MIN); + return hint_create(MP_CLASS_NUMBER, val); +} + +static inline hint_t +hint_double(double d) +{ + if (!isfinite(d)) + return HINT_NONE; + + uint64_t val; + if (d >= HINT_VALUE_DOUBLE_MAX) + val = HINT_VALUE_MAX; + else if (d <= HINT_VALUE_DOUBLE_MIN) + val = 0; + else + val = (int64_t)d - HINT_VALUE_INT_MIN; + + return hint_create(MP_CLASS_NUMBER, val); +} + +static inline uint64_t +hint_str_raw(const char *s, uint32_t len) +{ + len = MIN(len, HINT_VALUE_BYTES); + uint64_t val = 0; + for (uint32_t i = 0; i < len; i++) { + val <<= CHAR_BIT; + val |= (unsigned char)s[i]; + } + val <<= CHAR_BIT * (HINT_VALUE_BYTES - len); + return val; +} + +static inline hint_t +hint_str(const char *s, uint32_t len) +{ + uint64_t val = hint_str_raw(s, len); + return hint_create(MP_CLASS_STR, val); +} + +static inline hint_t +hint_str_coll(const char *s, uint32_t len, struct coll *coll) +{ + char buf[HINT_VALUE_BYTES]; + uint32_t buf_len = coll->hint(s, len, buf, sizeof(buf), coll); + uint64_t val = hint_str_raw(buf, buf_len); + return hint_create(MP_CLASS_STR, val); +} + +static inline hint_t +hint_bin(const char *s, uint32_t len) +{ + uint64_t val = hint_str_raw(s, len); + return hint_create(MP_CLASS_BIN, val); +} + +static inline hint_t +field_hint_boolean(const char *field) +{ + assert(mp_typeof(*field) == MP_BOOL); + return hint_bool(mp_decode_bool(&field)); +} + +static inline hint_t +field_hint_unsigned(const char *field) +{ + assert(mp_typeof(*field) == MP_UINT); + return hint_uint(mp_decode_uint(&field)); +} + +static inline hint_t +field_hint_integer(const char *field) +{ + switch (mp_typeof(*field)) { + case MP_UINT: + return hint_uint(mp_decode_uint(&field)); + case MP_INT: + return hint_int(mp_decode_int(&field)); + default: + unreachable(); + } + return HINT_NONE; +} + +static inline hint_t +field_hint_number(const char *field) +{ + switch (mp_typeof(*field)) { + case MP_UINT: + return hint_uint(mp_decode_uint(&field)); + case MP_INT: + 
return hint_int(mp_decode_int(&field)); + case MP_FLOAT: + return hint_double(mp_decode_float(&field)); + case MP_DOUBLE: + return hint_double(mp_decode_double(&field)); + default: + unreachable(); + } + return HINT_NONE; +} + +static inline hint_t +field_hint_string(const char *field, struct coll *coll) +{ + assert(mp_typeof(*field) == MP_STR); + uint32_t len = mp_decode_strl(&field); + return coll == NULL ? hint_str(field, len) : + hint_str_coll(field, len, coll); +} + +static inline hint_t +field_hint_scalar(const char *field, struct coll *coll) +{ + uint32_t len; + switch(mp_typeof(*field)) { + case MP_BOOL: + return hint_bool(mp_decode_bool(&field)); + case MP_UINT: + return hint_uint(mp_decode_uint(&field)); + case MP_INT: + return hint_int(mp_decode_int(&field)); + case MP_FLOAT: + return hint_double(mp_decode_float(&field)); + case MP_DOUBLE: + return hint_double(mp_decode_double(&field)); + case MP_STR: + len = mp_decode_strl(&field); + return coll == NULL ? hint_str(field, len) : + hint_str_coll(field, len, coll); + case MP_BIN: + len = mp_decode_binl(&field); + return hint_bin(field, len); + default: + unreachable(); + } + return HINT_NONE; +} + +template <enum field_type type, bool is_nullable> +static inline hint_t +field_hint(const char *field, struct coll *coll) +{ + if (is_nullable && mp_typeof(*field) == MP_NIL) + return hint_nil(); + switch (type) { + case FIELD_TYPE_BOOLEAN: + return field_hint_boolean(field); + case FIELD_TYPE_UNSIGNED: + return field_hint_unsigned(field); + case FIELD_TYPE_INTEGER: + return field_hint_integer(field); + case FIELD_TYPE_NUMBER: + return field_hint_number(field); + case FIELD_TYPE_STRING: + return field_hint_string(field, coll); + case FIELD_TYPE_SCALAR: + return field_hint_scalar(field, coll); + default: + unreachable(); + } + return HINT_NONE; +} + +template <enum field_type type, bool is_nullable> +static hint_t +key_hint(const char *key, uint32_t part_count, struct key_def *key_def) +{ + if (part_count == 0) + 
return HINT_NONE; + return field_hint<type, is_nullable>(key, key_def->parts->coll); +} + +template <enum field_type type, bool is_nullable> +static hint_t +tuple_hint(const struct tuple *tuple, struct key_def *key_def) +{ + const char *field = tuple_field_by_part(tuple, key_def->parts); + if (is_nullable && field == NULL) + return hint_nil(); + return field_hint<type, is_nullable>(field, key_def->parts->coll); +} + +template<enum field_type type, bool is_nullable> +static void +key_def_set_hint_func(struct key_def *def) +{ + def->key_hint = key_hint<type, is_nullable>; + def->tuple_hint = tuple_hint<type, is_nullable>; +} + +template<enum field_type type> +static void +key_def_set_hint_func(struct key_def *def) +{ + if (key_part_is_nullable(def->parts)) + key_def_set_hint_func<type, true>(def); + else + key_def_set_hint_func<type, false>(def); +} + +static void +key_def_set_hint_func(struct key_def *def) +{ + switch (def->parts->type) { + case FIELD_TYPE_BOOLEAN: + key_def_set_hint_func<FIELD_TYPE_BOOLEAN>(def); + break; + case FIELD_TYPE_UNSIGNED: + key_def_set_hint_func<FIELD_TYPE_UNSIGNED>(def); + break; + case FIELD_TYPE_INTEGER: + key_def_set_hint_func<FIELD_TYPE_INTEGER>(def); + break; + case FIELD_TYPE_NUMBER: + key_def_set_hint_func<FIELD_TYPE_NUMBER>(def); + break; + case FIELD_TYPE_STRING: + key_def_set_hint_func<FIELD_TYPE_STRING>(def); + break; + case FIELD_TYPE_SCALAR: + key_def_set_hint_func<FIELD_TYPE_SCALAR>(def); + break; + default: + /* Invalid key definition. 
*/ + def->key_hint = NULL; + def->tuple_hint = NULL; + break; + } +} + +/* }}} tuple_hint */ + static void key_def_set_compare_func_fast(struct key_def *def) { @@ -1195,7 +1670,9 @@ key_def_set_compare_func_fast(struct key_def *def) assert(!key_def_has_collation(def)); tuple_compare_t cmp = NULL; + tuple_compare_hinted_t cmp_hinted = NULL; tuple_compare_with_key_t cmp_wk = NULL; + tuple_compare_with_key_hinted_t cmp_wk_hinted = NULL; bool is_sequential = key_def_is_sequential(def); /* @@ -1210,6 +1687,7 @@ key_def_set_compare_func_fast(struct key_def *def) break; if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) { cmp = cmp_arr[k].f; + cmp_hinted = cmp_arr[k].f_hinted; break; } } @@ -1222,6 +1700,7 @@ key_def_set_compare_func_fast(struct key_def *def) } if (i == def->part_count) { cmp_wk = cmp_wk_arr[k].f; + cmp_wk_hinted = cmp_wk_arr[k].f_hinted; break; } } @@ -1229,15 +1708,23 @@ key_def_set_compare_func_fast(struct key_def *def) cmp = is_sequential ? tuple_compare_sequential<false, false> : tuple_compare_slowpath<false, false, false>; + cmp_hinted = is_sequential ? + tuple_compare_sequential_hinted<false, false> : + tuple_compare_slowpath_hinted<false, false, false>; } if (cmp_wk == NULL) { cmp_wk = is_sequential ? tuple_compare_with_key_sequential<false, false> : tuple_compare_with_key_slowpath<false, false, false>; + cmp_wk_hinted = is_sequential ? 
+ tuple_compare_with_key_sequential_hinted<false, false> : + tuple_compare_with_key_slowpath_hinted<false, false, false>; } def->tuple_compare = cmp; + def->tuple_compare_hinted = cmp_hinted; def->tuple_compare_with_key = cmp_wk; + def->tuple_compare_with_key_hinted = cmp_wk_hinted; } template<bool is_nullable, bool has_optional_parts> @@ -1248,13 +1735,23 @@ key_def_set_compare_func_plain(struct key_def *def) if (key_def_is_sequential(def)) { def->tuple_compare = tuple_compare_sequential <is_nullable, has_optional_parts>; + def->tuple_compare_hinted = tuple_compare_sequential_hinted + <is_nullable, has_optional_parts>; def->tuple_compare_with_key = tuple_compare_with_key_sequential <is_nullable, has_optional_parts>; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_sequential_hinted + <is_nullable, has_optional_parts>; } else { def->tuple_compare = tuple_compare_slowpath <is_nullable, has_optional_parts, false>; + def->tuple_compare_hinted = tuple_compare_slowpath_hinted + <is_nullable, has_optional_parts, false>; def->tuple_compare_with_key = tuple_compare_with_key_slowpath <is_nullable, has_optional_parts, false>; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_slowpath_hinted + <is_nullable, has_optional_parts, false>; } } @@ -1265,8 +1762,13 @@ key_def_set_compare_func_json(struct key_def *def) assert(def->has_json_paths); def->tuple_compare = tuple_compare_slowpath <is_nullable, has_optional_parts, true>; + def->tuple_compare_hinted = tuple_compare_slowpath_hinted + <is_nullable, has_optional_parts, true>; def->tuple_compare_with_key = tuple_compare_with_key_slowpath <is_nullable, has_optional_parts, true>; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_slowpath_hinted + <is_nullable, has_optional_parts, true>; } void @@ -1294,4 +1796,5 @@ key_def_set_compare_func(struct key_def *def) key_def_set_compare_func_json<false, false>(def); } } + key_def_set_hint_func(def); } diff --git a/src/lib/coll/coll.c 
b/src/lib/coll/coll.c index afc15e80..b83f0fdc 100644 --- a/src/lib/coll/coll.c +++ b/src/lib/coll/coll.c @@ -133,6 +133,30 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, return s_len; } +static size_t +coll_icu_hint(const char *s, size_t s_len, char *buf, size_t buf_len, + struct coll *coll) +{ + assert(coll->type == COLL_TYPE_ICU); + UCharIterator itr; + uiter_setUTF8(&itr, s, s_len); + uint32_t state[2] = {0, 0}; + UErrorCode status = U_ZERO_ERROR; + return ucol_nextSortKeyPart(coll->collator, &itr, state, + (uint8_t *)buf, buf_len, &status); +} + +static size_t +coll_bin_hint(const char *s, size_t s_len, char *buf, size_t buf_len, + struct coll *coll) +{ + (void)coll; + assert(coll->type == COLL_TYPE_BINARY); + size_t len = MIN(s_len, buf_len); + memcpy(buf, s, len); + return len; +} + /** * Set up ICU collator and init cmp and hash members of collation. * @param coll Collation to set up. @@ -242,6 +266,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def) } coll->cmp = coll_icu_cmp; coll->hash = coll_icu_hash; + coll->hint = coll_icu_hint; return 0; } @@ -356,6 +381,7 @@ coll_new(const struct coll_def *def) coll->collator = NULL; coll->cmp = coll_bin_cmp; coll->hash = coll_bin_hash; + coll->hint = coll_bin_hint; break; default: unreachable(); diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h index 8c9f9429..c505804f 100644 --- a/src/lib/coll/coll.h +++ b/src/lib/coll/coll.h @@ -48,6 +48,9 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t, typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry, struct coll *coll); +typedef size_t (*coll_hint_f)(const char *s, size_t s_len, char *buf, + size_t buf_len, struct coll *coll); + struct UCollator; /** @@ -61,7 +64,16 @@ struct coll { struct UCollator *collator; /** String comparator. */ coll_cmp_f cmp; + /** String hash function. */ coll_hash_f hash; + /** + * String comparison hint. 
+ * + * This function copies a sort key for a string to + * the given buffer and returns the number of bytes + * copied. Sort keys may be compared using strcmp(). + */ + coll_hint_f hint; /** Reference counter. */ int refs; /** ^ permalink raw reply [flat|nested] 11+ messages in thread
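To summarize the mechanism patch 2/3 adds: a hint packs the field's mp_class into the upper bits and an order-preserving value into the lower bits, so two unequal hints decide a comparison without ever touching tuple data; only equal (or missing) hints fall through to the full comparator. The sketch below illustrates that layout and the short-circuit under stated assumptions: hint_t is a uint64_t, HINT_NONE is the all-ones marker, and the demo_* names and DEMO_CLASS_NUMBER are illustrative stand-ins, not the patch's actual definitions (those live in tuple_compare.cc and key_def.h).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's types, for illustration only. */
typedef uint64_t hint_t;
#define HINT_NONE UINT64_MAX /* "no hint available" marker (assumption) */
#define HINT_BITS (sizeof(hint_t) * CHAR_BIT)
#define HINT_CLASS_BITS 4
#define HINT_VALUE_BITS (HINT_BITS - HINT_CLASS_BITS)
#define HINT_VALUE_INT_MIN (-(1LL << (HINT_VALUE_BITS - 1)))

enum demo_mp_class { DEMO_CLASS_NUMBER = 2 }; /* made-up class id */

/* Build a hint: class in the upper bits, value in the lower bits. */
static hint_t
demo_hint_create(enum demo_mp_class c, uint64_t val)
{
	assert((val >> HINT_VALUE_BITS) == 0);
	return ((hint_t)c << HINT_VALUE_BITS) | val;
}

/*
 * Integer hint: shift by the minimal representable value so that
 * negative and positive numbers order correctly as unsigned.
 */
static hint_t
demo_hint_int(int64_t i)
{
	return demo_hint_create(DEMO_CLASS_NUMBER,
				(uint64_t)(i - HINT_VALUE_INT_MIN));
}

/*
 * The short-circuit: unequal hints decide the comparison outright;
 * equal or missing hints mean the caller must run tuple_compare.
 */
static int
demo_hint_cmp(hint_t a, hint_t b)
{
	if (a != HINT_NONE && b != HINT_NONE && a != b)
		return a < b ? -1 : 1;
	return 0; /* fall back to the full comparator */
}
```

Note how the shared MP_CLASS_NUMBER class for signed, unsigned, and floating-point fields is what lets an index be altered between numeric types without recomputing stored hints.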
* [PATCH v6 3/3] box: introduce multikey indexes
  2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
  2 siblings, 0 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

I haven't finished the review fixes for multikey indexes yet, but I've
included this part to show how the modified hint changes affect
multikey indexes.

=========================================================================

With the introduction of the multikey index feature, a JSON path may
contain a [*] placeholder instead of an array index: this allows
indexing multiple documents by one JSON path automatically.

Example:
s = box.schema.space.create('withdata')
idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'},
                                      {3, 'str', path = '[*].sname'}}})
_ = s:insert({1, 1, {{fname='James', sname='Bond'},
                     {fname='Austin', sname='Powers'}}})
idx:get({'Austin', 'Powers'})
---
- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Austin',
      'sname': 'Powers'}]]
...
Closes #1257 --- src/box/key_def.c | 15 ++++ src/box/key_def.h | 5 ++ src/box/memtx_tree.c | 175 +++++++++++++++++++++++++++++++++++++- src/box/tuple.c | 8 +- src/box/tuple.h | 122 +++++++++++++++++++++++--- src/box/tuple_compare.cc | 115 +++++++++++++++++++------ src/box/tuple_format.c | 120 +++++++++++++++++++++----- test/engine/json.result | 80 ++++++++++++++++- test/engine/json.test.lua | 20 +++++ 9 files changed, 592 insertions(+), 68 deletions(-) diff --git a/src/box/key_def.c b/src/box/key_def.c index d4c979aa1..b9cc77699 100644 --- a/src/box/key_def.c +++ b/src/box/key_def.c @@ -164,9 +164,24 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno, *path_pool += path_len; memcpy(def->parts[part_no].path, path, path_len); def->parts[part_no].path_len = path_len; + + int rc; + struct json_lexer lexer; + struct json_token token; + json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE); + while ((rc = json_lexer_next_token(&lexer, &token)) == 0) { + if (token.type == JSON_TOKEN_ANY) { + def->has_multikey_parts = true; + def->parts[part_no].is_multikey = true; + break; + } else if (token.type == JSON_TOKEN_END) { + break; + } + } } else { def->parts[part_no].path = NULL; def->parts[part_no].path_len = 0; + def->parts[part_no].is_multikey = false; } column_mask_set_fieldno(&def->column_mask, fieldno); } diff --git a/src/box/key_def.h b/src/box/key_def.h index 0d8f3112e..6a914df11 100644 --- a/src/box/key_def.h +++ b/src/box/key_def.h @@ -97,6 +97,8 @@ struct key_part { char *path; /** The length of JSON path. */ uint32_t path_len; + /** True if this is multikey key part. */ + bool is_multikey; /** * Epoch of the tuple format the offset slot cached in * this part is valid for, see tuple_format::epoch. @@ -205,6 +207,8 @@ struct key_def { bool is_nullable; /** True if some key part has JSON path. */ bool has_json_paths; + /** True if some part has array index placeholder *. 
*/ + bool has_multikey_parts; /** * True, if some key parts can be absent in a tuple. These * fields assumed to be MP_NIL. @@ -699,6 +703,7 @@ key_hash(const char *key, struct key_def *key_def) * Get a comparison hint for a @a tuple. * @param tuple - tuple to get uint64_t of. * @param key_def - key_def that defines which comparison is used. + * @param multikey_idx - index of multikey array item. * @return the comparison auxiliary information. */ static inline uint64_t diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c index 2c1933ceb..bfc618bdf 100644 --- a/src/box/memtx_tree.c +++ b/src/box/memtx_tree.c @@ -537,7 +537,8 @@ memtx_tree_index_get(struct index *base, const char *key, struct memtx_tree_key_data key_data; key_data.key = key; key_data.part_count = part_count; - key_data.hint = key_hint(key, cmp_def); + if (!cmp_def->has_multikey_parts) + key_data.hint = key_hint(key, cmp_def); struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data); *result = res != NULL ? res->tuple : NULL; return 0; @@ -593,6 +594,98 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, return 0; } +static int +multikey_get_array_sz(struct tuple *tuple, struct key_def *key_def) +{ + assert(key_def->has_multikey_parts); + struct key_part *part = key_def->parts; + /* FIXME: don't like it... 
*/ + while (part < key_def->parts + key_def->part_count && + !part->is_multikey) + part++; + assert(part->path != NULL && part->is_multikey); + struct tuple_field *field = + tuple_format_field_by_path(tuple_format(tuple), part->fieldno, + part->path, part->path_len); + assert(field != NULL); + struct field_map_ext *field_map_ext = + tuple_field_map_ext((uint32_t *)tuple_field_map(tuple), + field->offset_slot); + return field_map_ext->size; +} + +int +memtx_multikey_tree_index_replace(struct index *base, struct tuple *old_tuple, + struct tuple *new_tuple, + enum dup_replace_mode mode, + struct tuple **result) +{ + struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *key_def = index->tree.arg; + if (new_tuple != NULL) { + int errcode = 0, tree_res = 0; + struct tuple *dup_tuple = NULL; + int multikey_idx = 0; + int sz = multikey_get_array_sz(new_tuple, key_def); + for (; multikey_idx < sz; multikey_idx++) { + struct memtx_tree_data new_data; + new_data.tuple = new_tuple; + new_data.hint = multikey_idx; + struct memtx_tree_data dup_data; + dup_data.tuple = NULL; + tree_res = memtx_tree_insert(&index->tree, new_data, + &dup_data); + if (tree_res != 0) { + diag_set(OutOfMemory, MEMTX_EXTENT_SIZE, + "memtx_tree_index", "replace"); + break; + } + dup_tuple = dup_data.tuple; + errcode = replace_check_dup(old_tuple, dup_tuple, mode); + if (errcode != 0) { + memtx_tree_delete(&index->tree, new_data); + if (dup_tuple != NULL) { + memtx_tree_insert(&index->tree, + dup_data, NULL); + } + struct space *sp = + space_cache_find(base->def->space_id); + if (sp != NULL) { + diag_set(ClientError, errcode, + base->def->name, + space_name(sp)); + } + break; + } + } + if (tree_res != 0 || errcode != 0) { + multikey_idx--; + for (; multikey_idx >= 0; multikey_idx--) { + struct memtx_tree_data data; + data.tuple = new_tuple; + data.hint = multikey_idx; + memtx_tree_delete(&index->tree, data); + } + return -1; + } + if (dup_tuple != NULL) { + *result = 
dup_tuple; + return 0; + } + } + if (old_tuple != NULL) { + int sz = multikey_get_array_sz(old_tuple, key_def); + for (int multikey_idx = 0; multikey_idx < sz; multikey_idx++) { + struct memtx_tree_data old_data; + old_data.tuple = old_tuple; + old_data.hint = multikey_idx; + memtx_tree_delete(&index->tree, old_data); + } + } + *result = old_tuple; + return 0; +} + static struct iterator * memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, const char *key, uint32_t part_count) @@ -630,7 +723,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, it->key_data.key = key; it->key_data.part_count = part_count; struct key_def *cmp_def = memtx_tree_index_cmp_def(index); - it->key_data.hint = key_hint(key, cmp_def); + if (!cmp_def->has_multikey_parts) + it->key_data.hint = key_hint(key, cmp_def); it->index_def = base->def; it->tree = &index->tree; it->tree_iterator = memtx_tree_invalid_iterator(); @@ -700,6 +794,48 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple) return 0; } +static int +memtx_multikey_tree_index_build_next(struct index *base, struct tuple *tuple) +{ + struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *key_def = index->tree.arg; + int sz = multikey_get_array_sz(tuple, key_def); + + if (index->build_array == NULL) { + index->build_array = + (struct memtx_tree_data *)malloc(MEMTX_EXTENT_SIZE); + if (index->build_array == NULL) { + diag_set(OutOfMemory, MEMTX_EXTENT_SIZE, + "memtx_tree_index", "build_next"); + return -1; + } + index->build_array_alloc_size = + MEMTX_EXTENT_SIZE / sizeof(index->build_array[0]); + } + assert(index->build_array_size <= index->build_array_alloc_size); + if (index->build_array_size == index->build_array_alloc_size) { + index->build_array_alloc_size = index->build_array_alloc_size + + index->build_array_alloc_size / 2; + struct memtx_tree_data *tmp = + realloc(index->build_array, + index->build_array_alloc_size * 
sizeof(*tmp)); + if (tmp == NULL) { + diag_set(OutOfMemory, index->build_array_alloc_size * + sizeof(*tmp), "memtx_tree_index", "build_next"); + return -1; + } + index->build_array = tmp; + } + for (int multikey_idx = 0; multikey_idx < sz; multikey_idx++) { + struct memtx_tree_data *elem = + &index->build_array[index->build_array_size++]; + assert(index->build_array_size < index->build_array_alloc_size); + elem->tuple = tuple; + elem->hint = multikey_idx; + } + return 0; +} + static void memtx_tree_index_end_build(struct index *base) { @@ -802,6 +938,36 @@ static const struct index_vtab memtx_tree_index_vtab = { /* .end_build = */ memtx_tree_index_end_build, }; +static const struct index_vtab memtx_multikey_tree_index_vtab = { + /* .destroy = */ memtx_tree_index_destroy, + /* .commit_create = */ generic_index_commit_create, + /* .abort_create = */ generic_index_abort_create, + /* .commit_modify = */ generic_index_commit_modify, + /* .commit_drop = */ generic_index_commit_drop, + /* .update_def = */ memtx_tree_index_update_def, + /* .depends_on_pk = */ memtx_tree_index_depends_on_pk, + /* .def_change_requires_rebuild = */ + memtx_index_def_change_requires_rebuild, + /* .size = */ memtx_tree_index_size, + /* .bsize = */ memtx_tree_index_bsize, + /* .min = */ generic_index_min, + /* .max = */ generic_index_max, + /* .random = */ memtx_tree_index_random, + /* .count = */ memtx_tree_index_count, + /* .get = */ memtx_tree_index_get, + /* .replace = */ memtx_multikey_tree_index_replace, + /* .create_iterator = */ memtx_tree_index_create_iterator, + /* .create_snapshot_iterator = */ + memtx_tree_index_create_snapshot_iterator, + /* .stat = */ generic_index_stat, + /* .compact = */ generic_index_compact, + /* .reset_stat = */ generic_index_reset_stat, + /* .begin_build = */ memtx_tree_index_begin_build, + /* .reserve = */ memtx_tree_index_reserve, + /* .build_next = */ memtx_multikey_tree_index_build_next, + /* .end_build = */ memtx_tree_index_end_build, +}; + struct index 
* memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def) { @@ -812,8 +978,11 @@ memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def) "malloc", "struct memtx_tree_index"); return NULL; } + const struct index_vtab *vtab = def->key_def->has_multikey_parts ? + &memtx_multikey_tree_index_vtab : + &memtx_tree_index_vtab; if (index_create(&index->base, (struct engine *)memtx, - &memtx_tree_index_vtab, def) != 0) { + vtab, def) != 0) { free(index); return NULL; } diff --git a/src/box/tuple.c b/src/box/tuple.c index 7f06d4053..c8a938c4c 100644 --- a/src/box/tuple.c +++ b/src/box/tuple.c @@ -455,7 +455,8 @@ tuple_field_go_to_key(const char **field, const char *key, int len) } int -tuple_go_to_path(const char **data, const char *path, uint32_t path_len) +tuple_go_to_path_multikey(const char **data, const char *path, + uint32_t path_len, uint64_t multikey_idx) { int rc; struct json_lexer lexer; @@ -463,6 +464,8 @@ tuple_go_to_path(const char **data, const char *path, uint32_t path_len) json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE); while ((rc = json_lexer_next_token(&lexer, &token)) == 0) { switch (token.type) { + case JSON_TOKEN_ANY: + token.num = (int)multikey_idx; case JSON_TOKEN_NUM: rc = tuple_field_go_to_index(data, token.num); break; @@ -532,7 +535,8 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple, } return tuple_field_raw_by_path(format, tuple, field_map, fieldno, path + lexer.offset, - path_len - lexer.offset, NULL); + path_len - lexer.offset, NULL, + INVALID_MULTIKEY_INDEX); } /* }}} tuple_field_* getters */ diff --git a/src/box/tuple.h b/src/box/tuple.h index 8b12fd5a8..e498e1e41 100644 --- a/src/box/tuple.h +++ b/src/box/tuple.h @@ -45,6 +45,8 @@ struct slab_arena; struct quota; struct key_part; +#define INVALID_MULTIKEY_INDEX UINT64_MAX + /** * A format for standalone tuples allocated on runtime arena. * \sa tuple_new(). 
@@ -286,6 +288,7 @@ box_tuple_upsert(const box_tuple_t *tuple, const char *expr, const /** \endcond public */ + /** * An atom of Tarantool storage. Represents MsgPack Array. * Tuple has the following structure: @@ -296,7 +299,9 @@ box_tuple_upsert(const box_tuple_t *tuple, const char *expr, const * | ^ * +---------------------------------------data_offset * - * Each 'off_i' is the offset to the i-th indexed field. + * Each 'off_i' is the offset to the i-th indexed field or field + * map extension (described by struct tuple_field_map_ext) + * offset (in sizeof(field_map[0]) units). */ struct PACKED tuple { @@ -328,6 +333,26 @@ struct PACKED tuple */ }; +/** + * Tuple field map extension. Allocated automatically by + * tuple_field_map_create when the tuple field map must be + * reallocated dynamically. + */ +struct field_map_ext { + /* Sequence of data offset slots. */ + uint32_t field_map[1]; + /* The count of field_map items. */ + uint32_t size; +}; + +static inline struct field_map_ext * +tuple_field_map_ext(uint32_t *field_map, int32_t root_offset_slot) +{ + uint32_t slot_off = field_map[root_offset_slot]; + return (struct field_map_ext *)((char *)field_map - + slot_off * sizeof(uint32_t) - sizeof(struct field_map_ext)); +} + /** Size of the tuple including size of struct tuple. */ static inline size_t tuple_size(const struct tuple *tuple) @@ -508,11 +533,32 @@ tuple_field_count(const struct tuple *tuple) * with NULL. * @param path The path to process. * @param path_len The length of the @path. + * @param multikey_idx The multikey index hint: index of the item + * at the JSON_TOKEN_ANY level. * @retval 0 On success. * @retval -1 In case of error in JSON path. */ int -tuple_go_to_path(const char **data, const char *path, uint32_t path_len); +tuple_go_to_path_multikey(const char **data, const char *path, + uint32_t path_len, uint64_t multikey_idx); + +/** + * Retrieve msgpack data by JSON path. + * @param data[in, out] Pointer to msgpack with data.
+ * If the field cannot be retrieved by the + * specified path @path, it is overwritten + * with NULL. + * @param path The path to process. + * @param path_len The length of the @path. + * @retval 0 On success. + * @retval -1 In case of error in JSON path. + */ +static inline int +tuple_go_to_path(const char **data, const char *path, uint32_t path_len) +{ + return tuple_go_to_path_multikey(data, path, path_len, + INVALID_MULTIKEY_INDEX); +} /** * Get tuple field by field index and relative JSON path. @@ -533,7 +579,7 @@ static inline const char * tuple_field_raw_by_path(struct tuple_format *format, const char *tuple, const uint32_t *field_map, uint32_t fieldno, const char *path, uint32_t path_len, - int32_t *offset_slot_hint) + int32_t *offset_slot_hint, uint64_t multikey_idx) { int32_t offset_slot; if (offset_slot_hint != NULL && @@ -558,10 +604,22 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple, if (offset_slot_hint != NULL) *offset_slot_hint = offset_slot; offset_slot_access: - /* Indexed field */ - if (field_map[offset_slot] == 0) - return NULL; - tuple += field_map[offset_slot]; + if (multikey_idx != INVALID_MULTIKEY_INDEX) { + struct field_map_ext *field_map_ext = + tuple_field_map_ext((uint32_t *)field_map, + offset_slot); + if (multikey_idx >= field_map_ext->size) + return NULL; + uint32_t off = field_map_ext->field_map[-multikey_idx]; + if (off == 0) + return NULL; + tuple += off; + } else { + /* Indexed field */ + if (field_map[offset_slot] == 0) + return NULL; + tuple += field_map[offset_slot]; + } } else { uint32_t field_count; parse: @@ -572,8 +630,8 @@ parse: for (uint32_t k = 0; k < fieldno; k++) mp_next(&tuple); if (path != NULL && - unlikely(tuple_go_to_path(&tuple, path, - path_len) != 0)) + unlikely(tuple_go_to_path_multikey(&tuple, path, path_len, + multikey_idx) != 0)) return NULL; } return tuple; @@ -595,7 +653,7 @@ tuple_field_raw(struct tuple_format *format, const char *tuple, const uint32_t *field_map, uint32_t
field_no) { return tuple_field_raw_by_path(format, tuple, field_map, field_no, - NULL, 0, NULL); + NULL, 0, NULL, INVALID_MULTIKEY_INDEX); } /** @@ -634,16 +692,18 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple, uint32_t path_len, uint32_t path_hash); /** - * Get a tuple field pointed to by an index part. + * Get a tuple field pointed to by an index part and multikey hint. * @param format Tuple format. * @param tuple A pointer to MessagePack array. * @param field_map A pointer to the LAST element of field map. + * @param multikey_idx A multikey hint. * @param part Index part to use. * @retval Field data if the field exists or NULL. */ static inline const char * -tuple_field_raw_by_part(struct tuple_format *format, const char *data, - const uint32_t *field_map, struct key_part *part) +tuple_field_raw_by_part_multikey(struct tuple_format *format, const char *data, + const uint32_t *field_map, + struct key_part *part, uint64_t multikey_idx) { if (unlikely(part->format_epoch != format->epoch)) { assert(format->epoch != 0); @@ -656,7 +716,41 @@ tuple_field_raw_by_part(struct tuple_format *format, const char *data, } return tuple_field_raw_by_path(format, data, field_map, part->fieldno, part->path, part->path_len, - &part->offset_slot_cache); + &part->offset_slot_cache, multikey_idx); +} + +/** + * Get a field referred to by index @part and a multikey hint in a tuple. + * @param tuple Tuple to get the field from. + * @param part Index part to use. + * @param multikey_idx A multikey hint. + * @retval Field data if the field exists or NULL. + */ +static inline const char * +tuple_field_by_part_multikey(const struct tuple *tuple, struct key_part *part, + uint64_t multikey_idx) +{ + return tuple_field_raw_by_part_multikey(tuple_format(tuple), + tuple_data(tuple), + tuple_field_map(tuple), part, + multikey_idx); +} + + +/** + * Get a tuple field pointed to by an index part. + * @param format Tuple format.
+ * @param tuple A pointer to MessagePack array. + * @param field_map A pointer to the LAST element of field map. + * @param part Index part to use. + * @retval Field data if the field exists or NULL. + */ +static inline const char * +tuple_field_raw_by_part(struct tuple_format *format, const char *data, + const uint32_t *field_map, struct key_part *part) +{ + return tuple_field_raw_by_part_multikey(format, data, field_map, + part, INVALID_MULTIKEY_INDEX); } /** diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index 86922c9bb..cdb08f8fe 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -351,6 +351,8 @@ key_def_set_hint_func(struct key_def *def) { def->key_hint = NULL; def->tuple_hint = NULL; + if (def->has_multikey_parts) + return; bool is_nullable = key_part_is_nullable(def->parts); switch (def->parts->type) { case FIELD_TYPE_UNSIGNED: { @@ -835,7 +837,8 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b, return i; } -template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +template<bool is_nullable, bool has_optional_parts, bool has_json_paths, + bool is_multikey> static inline int tuple_compare_slowpath_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint, const struct tuple *tuple_b, @@ -845,8 +848,10 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a, assert(!has_optional_parts || is_nullable); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + assert(is_multikey == key_def->has_multikey_parts); + assert(!is_multikey || is_multikey == has_json_paths); int rc = 0; - if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0) + if (!is_multikey && (rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0) return rc; struct key_part *part = key_def->parts; const char *tuple_a_raw = tuple_data(tuple_a); @@ -889,7 +894,14 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a, end = part + key_def->part_count; for (; 
part < end; part++) { - if (has_json_paths) { + if (is_multikey) { + field_a = tuple_field_raw_by_part_multikey(format_a, + tuple_a_raw, field_map_a, part, + tuple_a_hint); + field_b = tuple_field_raw_by_part_multikey(format_b, + tuple_b_raw, field_map_b, part, + tuple_b_hint); + } else if (has_json_paths) { field_a = tuple_field_raw_by_part(format_a, tuple_a_raw, field_map_a, part); field_b = tuple_field_raw_by_part(format_b, tuple_b_raw, @@ -946,7 +958,14 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a, */ end = key_def->parts + key_def->part_count; for (; part < end; ++part) { - if (has_json_paths) { + if (is_multikey) { + field_a = tuple_field_raw_by_part_multikey(format_a, + tuple_a_raw, field_map_a, part, + tuple_a_hint); + field_b = tuple_field_raw_by_part_multikey(format_b, + tuple_b_raw, field_map_b, part, + tuple_b_hint); + } else if (has_json_paths) { field_a = tuple_field_raw_by_part(format_a, tuple_a_raw, field_map_a, part); field_b = tuple_field_raw_by_part(format_b, tuple_b_raw, @@ -976,24 +995,28 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b, struct key_def *key_def) { return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts, - has_json_paths>(tuple_a, HINT_INVALID, tuple_b, + has_json_paths, false>(tuple_a, HINT_INVALID, tuple_b, HINT_INVALID, key_def); } -template<bool is_nullable, bool has_optional_parts, bool has_json_paths> +template<bool is_nullable, bool has_optional_parts, bool has_json_paths, + bool is_multikey> static inline int tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple, uint64_t tuple_hint, const char *key, uint32_t part_count, uint64_t key_hint, struct key_def *key_def) { + (void)key_hint; assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || is_nullable); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); assert(key != NULL || part_count == 0); assert(part_count <= 
key_def->part_count); + assert(is_multikey == key_def->has_multikey_parts); + assert(!is_multikey || is_multikey == has_json_paths); int rc = 0; - if ((rc = hint_cmp(tuple_hint, key_hint)) != 0) + if (!is_multikey && (rc = hint_cmp(tuple_hint, key_hint)) != 0) return rc; struct key_part *part = key_def->parts; struct tuple_format *format = tuple_format(tuple); @@ -1002,7 +1025,10 @@ tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple, enum mp_type a_type, b_type; if (likely(part_count == 1)) { const char *field; - if (has_json_paths) { + if (is_multikey) { + field = tuple_field_raw_by_part_multikey(format, + tuple_raw, field_map, part, tuple_hint); + } else if (has_json_paths) { field = tuple_field_raw_by_part(format, tuple_raw, field_map, part); } else { @@ -1032,7 +1058,10 @@ tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple, struct key_part *end = part + part_count; for (; part < end; ++part, mp_next(&key)) { const char *field; - if (has_json_paths) { + if (is_multikey) { + field = tuple_field_raw_by_part_multikey(format, + tuple_raw, field_map, part, tuple_hint); + } else if (has_json_paths) { field = tuple_field_raw_by_part(format, tuple_raw, field_map, part); } else { @@ -1074,7 +1103,7 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key, uint32_t part_count, struct key_def *key_def) { return tuple_compare_with_key_slowpath_hinted<is_nullable, - has_optional_parts, has_json_paths>(tuple, + has_optional_parts, has_json_paths, false>(tuple, HINT_INVALID, key, part_count, HINT_INVALID, key_def); } @@ -1508,14 +1537,22 @@ static const tuple_compare_t compare_slowpath_funcs[] = { }; static const tuple_compare_hinted_t compare_slowpath_hinted_funcs[] = { - tuple_compare_slowpath_hinted<false, false, false>, - tuple_compare_slowpath_hinted<true, false, false>, - tuple_compare_slowpath_hinted<false, true, false>, - tuple_compare_slowpath_hinted<true, true, false>, - tuple_compare_slowpath_hinted<false, false, 
true>, - tuple_compare_slowpath_hinted<true, false, true>, - tuple_compare_slowpath_hinted<false, true, true>, - tuple_compare_slowpath_hinted<true, true, true> + tuple_compare_slowpath_hinted<false, false, false, false>, + tuple_compare_slowpath_hinted<true, false, false, false>, + tuple_compare_slowpath_hinted<false, true, false, false>, + tuple_compare_slowpath_hinted<true, true, false, false>, + tuple_compare_slowpath_hinted<false, false, true, false>, + tuple_compare_slowpath_hinted<true, false, true, false>, + tuple_compare_slowpath_hinted<false, true, true, false>, + tuple_compare_slowpath_hinted<true, true, true, false>, + tuple_compare_slowpath_hinted<false, false, false, true>, + tuple_compare_slowpath_hinted<true, false, false, true>, + tuple_compare_slowpath_hinted<false, true, false, true>, + tuple_compare_slowpath_hinted<true, true, false, true>, + tuple_compare_slowpath_hinted<false, false, true, true>, + tuple_compare_slowpath_hinted<true, false, true, true>, + tuple_compare_slowpath_hinted<false, true, true, true>, + tuple_compare_slowpath_hinted<true, true, true, true>, }; void @@ -1523,7 +1560,14 @@ key_def_set_tuple_compare(struct key_def *def) { int cmp_func_idx = (def->is_nullable ? 1 : 0) + 2 * (def->has_optional_parts ? 1 : 0) + - 4 * (def->has_json_paths ? 1 : 0); + 4 * (def->has_json_paths ? 1 : 0) + + 8 * (def->has_multikey_parts ? 
1 : 0); + if (def->has_multikey_parts) { + def->tuple_compare = NULL; + def->tuple_compare_hinted = + compare_slowpath_hinted_funcs[cmp_func_idx]; + return; + } if (def->is_nullable) { if (key_def_is_sequential(def)) { if (def->has_optional_parts) { @@ -1795,14 +1839,22 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = { static const tuple_compare_with_key_hinted_t compare_with_key_slowpath_hinted_funcs[] = { - tuple_compare_with_key_slowpath_hinted<false, false, false>, - tuple_compare_with_key_slowpath_hinted<true, false, false>, - tuple_compare_with_key_slowpath_hinted<false, true, false>, - tuple_compare_with_key_slowpath_hinted<true, true, false>, - tuple_compare_with_key_slowpath_hinted<false, false, true>, - tuple_compare_with_key_slowpath_hinted<true, false, true>, - tuple_compare_with_key_slowpath_hinted<false, true, true>, - tuple_compare_with_key_slowpath_hinted<true, true, true> + tuple_compare_with_key_slowpath_hinted<false, false, false, false>, + tuple_compare_with_key_slowpath_hinted<true, false, false, false>, + tuple_compare_with_key_slowpath_hinted<false, true, false, false>, + tuple_compare_with_key_slowpath_hinted<true, true, false, false>, + tuple_compare_with_key_slowpath_hinted<false, false, true, false>, + tuple_compare_with_key_slowpath_hinted<true, false, true, false>, + tuple_compare_with_key_slowpath_hinted<false, true, true, false>, + tuple_compare_with_key_slowpath_hinted<true, true, true, false>, + tuple_compare_with_key_slowpath_hinted<false, false, false, true>, + tuple_compare_with_key_slowpath_hinted<true, false, false, true>, + tuple_compare_with_key_slowpath_hinted<false, true, false, true>, + tuple_compare_with_key_slowpath_hinted<true, true, false, true>, + tuple_compare_with_key_slowpath_hinted<false, false, true, true>, + tuple_compare_with_key_slowpath_hinted<true, false, true, true>, + tuple_compare_with_key_slowpath_hinted<false, true, true, true>, + tuple_compare_with_key_slowpath_hinted<true, 
true, true, true> }; void @@ -1810,7 +1862,14 @@ key_def_set_tuple_compare_with_key(struct key_def *def) { int cmp_func_idx = (def->is_nullable ? 1 : 0) + 2 * (def->has_optional_parts ? 1 : 0) + - 4 * (def->has_json_paths ? 1 : 0); + 4 * (def->has_json_paths ? 1 : 0) + + 8 * (def->has_multikey_parts ? 1 : 0); + if (def->has_multikey_parts) { + def->tuple_compare_with_key = NULL; + def->tuple_compare_with_key_hinted = + compare_with_key_slowpath_hinted_funcs[cmp_func_idx]; + return; + } if (def->is_nullable) { if (key_def_is_sequential(def)) { if (def->has_optional_parts) { diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c index 4439ce3e0..99b602cd5 100644 --- a/src/box/tuple_format.c +++ b/src/box/tuple_format.c @@ -33,6 +33,7 @@ #include "json/json.h" #include "tuple_format.h" #include "coll_id_cache.h" +#include "tuple.h" #include "third_party/PMurHash.h" @@ -233,14 +234,19 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno, json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE); while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 && field->token.type != JSON_TOKEN_END) { - if (field->token.type == JSON_TOKEN_ANY) { - diag_set(ClientError, ER_UNSUPPORTED, - "Tarantool", "multikey indexes"); - goto fail; - } enum field_type expected_type = field->token.type == JSON_TOKEN_STR ? 
FIELD_TYPE_MAP : FIELD_TYPE_ARRAY; + if (field->token.type == JSON_TOKEN_ANY && + !json_token_is_multikey(&parent->token) && + !json_token_is_leaf(&parent->token)) { + assert(expected_type == FIELD_TYPE_ARRAY); + diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH, + tuple_field_path(parent), + field_type_strs[parent->type], + "multikey array"); + goto fail; + } if (field_type1_contains_type2(parent->type, expected_type)) { parent->type = expected_type; } else { @@ -475,9 +481,8 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys, break; format->min_tuple_size += mp_sizeof_nil(); } - } else { + } else if (field->token.type == JSON_TOKEN_STR) { /* Account a key string for map member. */ - assert(field->token.type == JSON_TOKEN_STR); format->min_tuple_size += mp_sizeof_str(field->token.len); } @@ -805,6 +810,59 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1, return true; } +/** + * Grow tuple_field_map allocation left with extent_size + * 0 bytes. + */ +static int +tuple_field_map_realloc(uint32_t **field_map, uint32_t *field_map_size, + uint32_t extent_size, struct region *region) +{ + assert(extent_size % sizeof(uint32_t) == 0); + uint32_t new_field_map_size = *field_map_size + extent_size; + uint32_t *new_field_map = region_alloc(region, new_field_map_size); + if (new_field_map == NULL) { + diag_set(OutOfMemory, new_field_map_size, "region_alloc", + "new_field_map"); + return -1; + } + memset(new_field_map, 0, extent_size); + if (*field_map != NULL) { + memcpy((char *)new_field_map + extent_size, + (char *)*field_map - *field_map_size, *field_map_size); + } + *field_map = (uint32_t *)((char *)new_field_map + new_field_map_size); + *field_map_size = new_field_map_size; + return 0; +} + +/** + * Retrieve tuple field map extent by root_offset_slot, reallocate + * memory if required. 
+ */ +struct field_map_ext * +tuple_field_map_ext_retrieve(uint32_t **field_map, uint32_t *field_map_size, + int32_t root_offset_slot, uint32_t extent_items, + struct region *region) +{ + uint32_t extent_sz = sizeof(struct field_map_ext) + + extent_items * sizeof(uint32_t); + uint32_t slot_off = (*field_map)[root_offset_slot]; + if (slot_off == 0) { + /* Field map extent was not created yet. */ + slot_off = *field_map_size / sizeof(uint32_t); + (*field_map)[root_offset_slot] = slot_off; + if (tuple_field_map_realloc(field_map, field_map_size, + extent_sz, region) != 0) + return NULL; + } + struct field_map_ext *field_map_ext = + tuple_field_map_ext(*field_map, root_offset_slot); + assert(field_map_ext->size == 0 || field_map_ext->size == extent_items); + field_map_ext->size = extent_items; + return field_map_ext; +} + /** @sa declaration for details. */ int tuple_field_map_create(struct tuple_format *format, const char *tuple, @@ -817,14 +875,11 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple, return 0; /* Nothing to initialize */ } struct region *region = &fiber()->gc; - *field_map_size = format->field_map_size; - *field_map = region_alloc(region, *field_map_size); - if (*field_map == NULL) { - diag_set(OutOfMemory, *field_map_size, "region_alloc", - "field_map"); + *field_map = NULL; + *field_map_size = 0; + if (tuple_field_map_realloc(field_map, field_map_size, + format->field_map_size, region) != 0) return -1; - } - *field_map = (uint32_t *)((char *)*field_map + *field_map_size); const char *pos = tuple; int rc = 0; @@ -864,11 +919,6 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple, uint32_t defined_field_count = MIN(field_count, validate ? tuple_format_field_count(format) : format->index_field_count); - /* - * Nullify field map to be able to detect by 0, - * which key fields are absent in tuple_field(). 
- */ - memset((char *)*field_map - *field_map_size, 0, *field_map_size); /* * Prepare mp stack of the size equal to the maximum depth * of the indexed field in the format::fields tree @@ -885,6 +935,12 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple, struct mp_stack stack; mp_stack_create(&stack, format->fields_depth, frames); mp_stack_push(&stack, MP_ARRAY, defined_field_count); + /** + * Pointer to the stack entry that represents the multikey + * index root ARRAY entry. + */ + struct mp_frame *mk_parent_frame = NULL; + struct tuple_field *field; struct json_token *parent = &format->fields.root; while (true) { @@ -901,6 +957,10 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple, mp_stack_pop(&stack); if (mp_stack_is_empty(&stack)) goto finish; + if (json_token_is_multikey(parent)) { + /* Leave multikey index branch. */ + mk_parent_frame = NULL; + } parent = parent->parent; } /* @@ -950,8 +1010,23 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple, field_type_strs[field->type]); goto error; } - if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) + if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL && + mk_parent_frame == NULL) { (*field_map)[field->offset_slot] = pos - tuple; + } else if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL && + mk_parent_frame != NULL) { + uint32_t extent_items = mk_parent_frame->count; + struct field_map_ext *field_map_ext = + tuple_field_map_ext_retrieve(field_map, + field_map_size, + field->offset_slot, + extent_items, region); + if (field_map_ext == NULL) + goto error; + int multikey_idx = mk_parent_frame->curr; + field_map_ext->field_map[ + -multikey_idx] = pos - tuple; + } if (required_fields != NULL) bit_clear(required_fields, field->id); } @@ -968,6 +1043,11 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple, mp_decode_array(&pos) : mp_decode_map(&pos); mp_stack_push(&stack, type, size); + if (json_token_is_multikey(&field->token)) { + /* Track 
multikey entry frame. */ + assert(type == MP_ARRAY); + mk_parent_frame = &frames[stack.used - 1]; + } parent = &field->token; } else { mp_next(&pos); diff --git a/test/engine/json.result b/test/engine/json.result index a790cdfbc..5c7a9b3b3 100644 --- a/test/engine/json.result +++ b/test/engine/json.result @@ -691,7 +691,85 @@ s = box.schema.space.create('withdata') ... idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) --- -- error: Tarantool does not support multikey indexes +... +s:insert({1, 1, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}}) +--- +- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({1, 1, {{fname='Ivan', sname='Ivanych'}}}) +--- +- [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]] +... +idx:select({'James', 'Bond'}) +--- +- - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +idx:select({'Kirill', 'Shcherbatov'}) +--- +- [] +... +idx:select({'Ivan', 'Ivanych'}) +--- +- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]] +... +idx:select({'Vasya', 'Pupkin'}) +--- +- - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +idx:select() +--- +- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}}) +--- +- error: Duplicate key exists in unique index 'idx' in space 'withdata' +... +s:insert({2, 1, {{fname='James', sname='Bond'}}}) +--- +- error: Duplicate key exists in unique index 'idx' in space 'withdata' +... 
+idx:select() +--- +- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +idx:delete({'Vasya', 'Pupkin'}) +--- +- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}}) +--- +- [2, 1, [{'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({2, 1, {{fname='James', sname='Bond'}}}) +--- +- [2, 1, [{'fname': 'James', 'sname': 'Bond'}]] +... +idx:select() +--- +- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [2, 1, [{'fname': 'James', 'sname': 'Bond'}]] + - [2, 1, [{'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:drop() +--- +... +s = box.schema.space.create('withdata') +--- +... +pk = s:create_index('pk') +--- +... +idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}}) +--- +... +s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}}) +--- +- [1, 1, [{'fname': 'A1', 'extra': {'sname': 'A2'}}, {'fname': 'B1'}, {'fname': 'C1', + 'extra': {'sname': 'C2'}}]] ... 
s:drop() --- diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua index f9a7180b1..dc6016af9 100644 --- a/test/engine/json.test.lua +++ b/test/engine/json.test.lua @@ -198,4 +198,24 @@ s:drop() -- s = box.schema.space.create('withdata') idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +s:insert({1, 1, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}}) +s:insert({1, 1, {{fname='Ivan', sname='Ivanych'}}}) +idx:select({'James', 'Bond'}) +idx:select({'Kirill', 'Shcherbatov'}) +idx:select({'Ivan', 'Ivanych'}) +idx:select({'Vasya', 'Pupkin'}) +idx:select() +s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}}) +s:insert({2, 1, {{fname='James', sname='Bond'}}}) +idx:select() +idx:delete({'Vasya', 'Pupkin'}) +s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}}) +s:insert({2, 1, {{fname='James', sname='Bond'}}}) +idx:select() +s:drop() + +s = box.schema.space.create('withdata') +pk = s:create_index('pk') +idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}}) +s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}}) s:drop() -- 2.21.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
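The hinted comparators in tuple_compare.cc short-circuit on hint_cmp() before touching any tuple data: if the hints already decide the order, the full field-by-field comparison is skipped. A minimal Python model of that contract (illustrative names only, not Tarantool's C API):

```python
HINT_INVALID = None

def hint_cmp(hint_a, hint_b):
    # A decided order (-1 or 1) lets the caller skip the full compare;
    # 0 means the hints are equal or unusable and the tuples must be
    # compared field by field.
    if hint_a is HINT_INVALID or hint_b is HINT_INVALID:
        return 0
    return (hint_a > hint_b) - (hint_a < hint_b)

def tuple_compare_hinted(tuple_a, hint_a, tuple_b, hint_b, full_cmp):
    # Mirrors the structure of tuple_compare_slowpath_hinted():
    # consult the hints first, fall back to the slow path on a tie.
    rc = hint_cmp(hint_a, hint_b)
    if rc != 0:
        return rc
    return full_cmp(tuple_a, tuple_b)
```

The point of the model: a full compare only ever runs when the hints tie, which is why precalculated hints pay off on select-heavy workloads.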
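key_def_set_tuple_compare() and key_def_set_tuple_compare_with_key() pick a template instantiation by packing the four key_def flags into a bit index over the 16-entry function tables; multikey variants occupy the upper half. A Python sketch of that arithmetic:

```python
def cmp_func_idx(is_nullable, has_optional_parts, has_json_paths,
                 has_multikey_parts):
    # Same weights as in the patch: nullable is bit 0, optional parts
    # bit 1, JSON paths bit 2, multikey parts bit 3.
    return ((1 if is_nullable else 0) +
            2 * (1 if has_optional_parts else 0) +
            4 * (1 if has_json_paths else 0) +
            8 * (1 if has_multikey_parts else 0))
```

Because the multikey flag is the highest bit, indices 8..15 select the sixteen new `<..., true>` instantiations appended to the tables.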
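The json.test.lua cases above rely on the defining property of a multikey index: one stored tuple contributes one key per element of the array matched by `[*]`, all pointing back to the same tuple (which is why `idx:select()` lists the James/Vasya tuple twice). A Python sketch of that keying rule — `multikey_keys` is a hypothetical helper, not Tarantool code:

```python
def multikey_keys(tup, extract):
    # One index key per element of the indexed array (field 3 in the
    # test schema); every key refers back to the same stored tuple.
    return [extract(item) for item in tup[2]]

bond_tuple = (1, 1, [{'fname': 'James', 'sname': 'Bond'},
                     {'fname': 'Vasya', 'sname': 'Pupkin'}])
keys = multikey_keys(bond_tuple, lambda m: (m['fname'], m['sname']))
```

With a unique index this also explains the duplicate-key errors in the test: inserting a second tuple whose array contains `{'Vasya', 'Pupkin'}` collides with one of the keys produced by the first tuple.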
Thread overview: 11+ messages (last update: 2019-03-20 18:08 UTC)

2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
2019-03-14  7:04 ` Vladimir Davydov
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
2019-03-15 10:55 ` Kirill Shcherbatov
2019-03-19 19:38 ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
2019-03-14  8:19 ` Vladimir Davydov
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
2019-03-20 18:08 ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov