* [PATCH v6 0/3] box: introduce hint option for memtx tree index
@ 2019-03-13 12:15 Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov
Reworked the memtx tree to use 'tuple hints'.
Introduced special functions for retrieving tuple hints for a particular key_def.
A hint is an integer that can be used for tuple comparison optimization:
if the hint of one tuple is less than the hint of another, then the first
tuple is definitely less than the second; only if the hints are equal
must tuple_compare be called to get the comparison result.
Hints are only useful when:
* they are precalculated and stored along with the tuple;
the calculation is not free (though cheaper than tuple_compare),
but precalculated hints allow comparing tuples without even
fetching the tuple data;
* the first part of the key_def is 'string', 'unsigned' or 'integer';
* since the hint is calculated using only the first part of the key_def
(and only the first several characters if it is a string), this part
must differ with high probability for every tuple pair.
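The comparison rule above can be sketched in C. This is a minimal illustration with hypothetical names, modeled on the format described in the series (2 type bits + 62 value bits), not the patch's exact code; in particular it omits the HINT_INVALID handling:

```c
#include <stdint.h>

/* Illustrative encoding: the two most significant bits of the
 * 64-bit hint carry a type tag, the low 62 bits carry the value. */
#define HINT_TYPE(h)  ((uint64_t)(h) >> 62)
#define HINT_VALUE(h) ((uint64_t)(h) & (((uint64_t)1 << 62) - 1))
#define HINT(type, value) (((uint64_t)(type) << 62) | (uint64_t)(value))

/* Hints tie (return 0) when their types differ or their values are
 * equal; in that case the full tuple comparison is still required.
 * Otherwise the hints order the operands outright. */
static int
hint_cmp_sketch(uint64_t a, uint64_t b)
{
	if (HINT_TYPE(a) == HINT_TYPE(b) && HINT_VALUE(a) != HINT_VALUE(b))
		return a < b ? -1 : 1;
	return 0;
}
```

A nonzero result decides the comparison without ever touching the tuple data; a zero result falls back to tuple_compare.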
The enabled hint option improves performance by about 15% on average;
select operations are accelerated significantly (there are scenarios in
which the difference reaches 200-250%).
Also appended a multikey index prototype. I am going to rework the
field_map initialization and resend the last letter a bit later.
Changes in version 6:
-- Changed hint format: now the two most significant bits are reserved for the type
-- Hints for SCALAR type
-- All compare routines now have two versions: hinted and non-hinted
Changes in version 5:
-- Code rewritten without classes and macro definitions using vtabs.
-- Appended multikey index prototype.
Changes in version 4:
-- Code rewritten in C++ with classes. This provides better maintainability
in the future.
-- New hints for number and boolean types. Memtx Tree is always hinted now.
-- INVALID_HINT marker. We need it because doubles have special values
like NaN that must not become a problem for the hints machinery.
-- Rebased the patch after part of the code was merged.
Changes in version 3:
-- Better-structured code
-- Refactored all memtx indexes to use shared mempool of default size
-- Refactored all memtx indexes to hide implementation details from headers
-- Moved all hints-related code to corresponding module
-- Better types names, better comments
-- Fixed a potential bug with iterators: introduced the MEMTX_TREE_IDENTICAL macro
-- Fixed inaccurate MEMTX_TREE_ELEM_SET usage in memtx_tree_index_build_next
-- Changed the approach to calculating string hints
-- Introduced a separate hint for the binary collation type
Changes in version 2:
-- Split the patch into parts in a different way to decrease the diff
-- Hints are not an index option anymore, but the default where possible
-- Removed hints for numeric types
v5: https://www.freelists.org/post/tarantool-patches/PATCH-v5-04-box-introduce-hint-option-for-memtx-tree-index
v4: https://www.freelists.org/post/tarantool-patches/PATCH-v4-07-box-introduce-hint-option-for-memtx-tree-index
v3: https://www.freelists.org/post/tarantool-patches/PATCH-v3-07-box-introduce-hint-option-for-memtx-tree-index
v2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-04-box-introduce-hint-option-for-memtx-tree-index
v1: https://www.freelists.org/post/tarantool-patches/PATCH-v1-04-box-introduce-hint-option-for-memtx-tree-index
http://github.com/tarantool/tarantool/tree/kshch/gh-3961-tuple-hints
https://github.com/tarantool/tarantool/issues/3961
Kirill Shcherbatov (3):
box: refactor key_def_set_compare_func routine
memtx: introduce tuple compare hint
box: introduce multikey indexes
src/box/key_def.c | 15 +
src/box/key_def.h | 100 ++++++
src/box/memtx_tree.c | 217 +++++++++++-
src/box/tuple.c | 8 +-
src/box/tuple.h | 122 ++++++-
src/box/tuple_compare.cc | 705 +++++++++++++++++++++++++++++++++++---
src/box/tuple_format.c | 120 +++++--
src/lib/coll/coll.c | 33 ++
src/lib/coll/coll.h | 4 +
test/engine/json.result | 80 ++++-
test/engine/json.test.lua | 20 ++
11 files changed, 1330 insertions(+), 94 deletions(-)
--
2.21.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
2019-03-14 7:04 ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov
2 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov
Refactored the tuple_compare_create and tuple_compare_with_key_create
routines to set the corresponding key_def field themselves instead of
returning a function pointer. This is required because further patches
need to set two key_def pointers: one for the hinted and one for the
non-hinted routine version.
Needed for #3961
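As a hypothetical miniature of this pattern (the names below are illustrative, not the patch's code): a creator that returns one function pointer cannot grow to fill several fields, while a setter that writes into the definition itself can:

```c
#include <stddef.h>

/* A toy definition with two comparator slots, mirroring the idea of
 * key_def holding both a plain and a hinted comparator pointer. */
struct toy_def {
	int (*compare)(int, int);
	int (*compare_hinted)(int, int);
};

/* Three-way sign comparison. */
static int
cmp_plain(int a, int b)
{
	return (a > b) - (a < b);
}

/* Setter-style routine: fills in every comparator field itself
 * instead of returning a single pointer to the caller. */
static void
toy_def_set_compare(struct toy_def *d)
{
	d->compare = cmp_plain;
	/* A later patch would install a different, hinted routine here. */
	d->compare_hinted = cmp_plain;
}
```

With this shape, adding a second (hinted) pointer is a one-line change in the setter rather than a new return value threaded through every caller.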
---
src/box/tuple_compare.cc | 62 ++++++++++++++++++++++++----------------
1 file changed, 38 insertions(+), 24 deletions(-)
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index cf4519ccb..1bcff2ca3 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -1049,21 +1049,26 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
tuple_compare_slowpath<true, true, true>
};
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
+void
+key_def_set_tuple_compare(struct key_def *def)
{
int cmp_func_idx = (def->is_nullable ? 1 : 0) +
2 * (def->has_optional_parts ? 1 : 0) +
4 * (def->has_json_paths ? 1 : 0);
if (def->is_nullable) {
if (key_def_is_sequential(def)) {
- if (def->has_optional_parts)
- return tuple_compare_sequential<true, true>;
- else
- return tuple_compare_sequential<true, false>;
+ if (def->has_optional_parts) {
+ def->tuple_compare =
+ tuple_compare_sequential<true, true>;
+ } else {
+ def->tuple_compare =
+ tuple_compare_sequential<true, false>;
+ }
} else {
- return compare_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare =
+ compare_slowpath_funcs[cmp_func_idx];
}
+ return;
}
assert(! def->has_optional_parts);
if (!key_def_has_collation(def) && !def->has_json_paths) {
@@ -1078,13 +1083,15 @@ tuple_compare_create(const struct key_def *def)
cmp_arr[k].p[i * 2 + 1])
break;
if (i == def->part_count &&
- cmp_arr[k].p[i * 2] == UINT32_MAX)
- return cmp_arr[k].f;
+ cmp_arr[k].p[i * 2] == UINT32_MAX) {
+ def->tuple_compare = cmp_arr[k].f;
+ return;
+ }
}
}
- return key_def_is_sequential(def) ?
- tuple_compare_sequential<false, false> :
- compare_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare = key_def_is_sequential(def) ?
+ tuple_compare_sequential<false, false> :
+ compare_slowpath_funcs[cmp_func_idx];
}
/* }}} tuple_compare */
@@ -1277,8 +1284,8 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
tuple_compare_with_key_slowpath<true, true, true>
};
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
+void
+key_def_set_tuple_compare_with_key(struct key_def *def)
{
int cmp_func_idx = (def->is_nullable ? 1 : 0) +
2 * (def->has_optional_parts ? 1 : 0) +
@@ -1286,15 +1293,19 @@ tuple_compare_with_key_create(const struct key_def *def)
if (def->is_nullable) {
if (key_def_is_sequential(def)) {
if (def->has_optional_parts) {
- return tuple_compare_with_key_sequential<true,
+ def->tuple_compare_with_key =
+ tuple_compare_with_key_sequential<true,
true>;
} else {
- return tuple_compare_with_key_sequential<true,
+ def->tuple_compare_with_key =
+ tuple_compare_with_key_sequential<true,
false>;
}
} else {
- return compare_with_key_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare_with_key =
+ compare_with_key_slowpath_funcs[cmp_func_idx];
}
+ return;
}
assert(! def->has_optional_parts);
if (!key_def_has_collation(def) && !def->has_json_paths) {
@@ -1312,13 +1323,16 @@ tuple_compare_with_key_create(const struct key_def *def)
break;
}
}
- if (i == def->part_count)
- return cmp_wk_arr[k].f;
+ if (i == def->part_count) {
+ def->tuple_compare_with_key = cmp_wk_arr[k].f;
+ return;
+ }
}
}
- return key_def_is_sequential(def) ?
- tuple_compare_with_key_sequential<false, false> :
- compare_with_key_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare_with_key =
+ key_def_is_sequential(def) ?
+ tuple_compare_with_key_sequential<false, false> :
+ compare_with_key_slowpath_funcs[cmp_func_idx];
}
/* }}} tuple_compare_with_key */
@@ -1326,6 +1340,6 @@ tuple_compare_with_key_create(const struct key_def *def)
void
key_def_set_compare_func(struct key_def *def)
{
- def->tuple_compare = tuple_compare_create(def);
- def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+ key_def_set_tuple_compare(def);
+ key_def_set_tuple_compare_with_key(def);
}
--
2.21.0
* [PATCH v6 2/3] memtx: introduce tuple compare hint
2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
2019-03-14 8:19 ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov
2 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov
Implement functions for retrieving tuple hints for a particular
key_def. A hint is an integer that can be used for tuple comparison
optimization: if the hint of one tuple is less than the hint of
another, then the first tuple is definitely less than the second;
only if the hints are equal must tuple_compare be called to get the
comparison result.
Hints are calculated using only the first part of key_def.
Hints are only useful when:
* they are precalculated and stored along with the tuple;
the calculation is not free (though cheaper than tuple_compare),
but precalculated hints allow comparing tuples without even
fetching the tuple data;
* the first part of the key_def is 'string' (for now);
* since the hint is calculated using only the first part of the key_def
(and only the first several characters if it is a string), this part
must differ with high probability for every tuple pair.
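The string case can be sketched as follows. This mirrors the idea used in the patch (pack a bounded prefix so that integer comparison agrees with byte order), with an illustrative function name, not the exact patch code:

```c
#include <stdint.h>

/* Pack up to the first 7 bytes of a string into the low 56 bits of a
 * uint64_t, big-endian, so that comparing the resulting integers
 * agrees with memcmp() order on those prefix bytes. */
static uint64_t
string_hint_sketch(const char *str, uint32_t len)
{
	const unsigned char *s = (const unsigned char *)str;
	uint64_t result = 0;
	uint32_t process_len = len < 7 ? len : 7;
	for (uint32_t i = 0; i < process_len; i++) {
		result <<= 8;
		result |= s[i];
	}
	/* Shorter strings are padded with zero bytes on the right,
	 * so a string sorts before its own extensions. */
	result <<= 8 * (7 - process_len);
	return result;
}
```

Strings sharing a 7-byte prefix get equal hints, which is exactly the case where the fallback to the full comparison is still needed.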
Closes #3961
---
src/box/key_def.h | 95 +++++++
src/box/memtx_tree.c | 46 ++-
src/box/tuple_compare.cc | 584 +++++++++++++++++++++++++++++++++++++--
src/lib/coll/coll.c | 33 +++
src/lib/coll/coll.h | 4 +
5 files changed, 730 insertions(+), 32 deletions(-)
diff --git a/src/box/key_def.h b/src/box/key_def.h
index dd62f6a35..0d8f3112e 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -137,6 +137,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
const struct tuple *tuple_b,
struct key_def *key_def);
+/** @copydoc tuple_compare_with_key_hinted() */
+typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
+ uint64_t tuple_hint,
+ const char *key,
+ uint32_t part_count,
+ uint64_t key_hint,
+ struct key_def *key_def);
+/** @copydoc tuple_compare_hinted() */
+typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a,
+ uint64_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ uint64_t tuple_b_hint,
+ struct key_def *key_def);
/** @copydoc tuple_extract_key() */
typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
struct key_def *key_def,
@@ -152,6 +165,11 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
/** @copydoc key_hash() */
typedef uint32_t (*key_hash_t)(const char *key,
struct key_def *key_def);
+/** @copydoc tuple_hint() */
+typedef uint64_t (*tuple_hint_t)(const struct tuple *tuple,
+ struct key_def *key_def);
+/** @copydoc key_hint() */
+typedef uint64_t (*key_hint_t)(const char *key, struct key_def *key_def);
/* Definition of a multipart key. */
struct key_def {
@@ -159,6 +177,10 @@ struct key_def {
tuple_compare_t tuple_compare;
/** @see tuple_compare_with_key() */
tuple_compare_with_key_t tuple_compare_with_key;
+ /** @see tuple_compare_with_key_hinted() */
+ tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
+ /** @see tuple_compare_hinted() */
+ tuple_compare_hinted_t tuple_compare_hinted;
/** @see tuple_extract_key() */
tuple_extract_key_t tuple_extract_key;
/** @see tuple_extract_key_raw() */
@@ -167,6 +189,10 @@ struct key_def {
tuple_hash_t tuple_hash;
/** @see key_hash() */
key_hash_t key_hash;
+ /** @see tuple_hint() */
+ tuple_hint_t tuple_hint;
+ /** @see key_hint() */
+ key_hint_t key_hint;
/**
* Minimal part count which always is unique. For example,
* if a secondary index is unique, then
@@ -571,6 +597,51 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
}
+/**
+ * Compare tuples using the key definition and comparison hints.
+ * @param tuple_a First tuple.
+ * @param tuple_a_hint Comparison hint is calculated for the
+ * @a tuple_a.
+ * @param tuple_b Second tuple.
+ * @param tuple_b_hint Comparison hint is calculated for the
+ * @a tuple_b.
+ * @param key_def Key definition.
+ * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b)
+ * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b)
+ * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b)
+ */
+static inline int
+tuple_compare_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint,
+ const struct tuple *tuple_b, uint64_t tuple_b_hint,
+ struct key_def *key_def)
+{
+ return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
+ tuple_b_hint, key_def);
+}
+
+/**
+ * Compare tuple with key using the key definition and
+ * comparison hints.
+ * @param tuple tuple
+ * @param tuple_hint Comparison hint is calculated for @a tuple.
+ * @param key key parts without MessagePack array header
+ * @param part_count the number of parts in @a key
+ * @param key_hint Comparison hint calculated for @a key.
+ * @param key_def key definition
+ * @retval 0 if key_fields(tuple) == parts(key)
+ * @retval <0 if key_fields(tuple) < parts(key)
+ * @retval >0 if key_fields(tuple) > parts(key)
+ */
+static inline int
+tuple_compare_with_key_hinted(const struct tuple *tuple, uint64_t tuple_hint,
+ const char *key, uint32_t part_count,
+ uint64_t key_hint, struct key_def *key_def)
+{
+ return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
+ part_count, key_hint,
+ key_def);
+}
+
/**
* Compute hash of a tuple field.
* @param ph1 - pointer to running hash
@@ -624,6 +695,30 @@ key_hash(const char *key, struct key_def *key_def)
return key_def->key_hash(key, key_def);
}
+/**
+ * Get a comparison hint for @a tuple.
+ * @param tuple - tuple to get the hint of.
+ * @param key_def - key_def that defines which comparison is used.
+ * @return the comparison auxiliary information.
+ */
+static inline uint64_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+ return key_def->tuple_hint(tuple, key_def);
+}
+
+/**
+ * Get a comparison hint of a @a key.
+ * @param key - key to get hint of.
+ * @param key_def - key_def that defines which comparison is used.
+ * @return the comparison auxiliary information.
+ */
+static inline uint64_t
+key_hint(const char *key, struct key_def *key_def)
+{
+ return key_def->key_hint(key, key_def);
+}
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index d5b42efb6..2c1933ceb 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -47,6 +47,11 @@ struct memtx_tree_key_data {
const char *key;
/** Number of msgpacked search fields. */
uint32_t part_count;
+ /**
+ * Comparison hint is calculated with
+ * key_hint(key, ...).
+ */
+ uint64_t hint;
};
/**
@@ -55,6 +60,11 @@ struct memtx_tree_key_data {
struct memtx_tree_data {
/* Tuple that this node represents. */
struct tuple *tuple;
+ /**
+ * Comparison hint is calculated with
+ * tuple_hint(tuple, ...).
+ */
+ uint64_t hint;
};
/**
@@ -69,16 +79,18 @@ static bool
memtx_tree_data_identical(const struct memtx_tree_data *a,
const struct memtx_tree_data *b)
{
- return a->tuple == b->tuple;
+ return a->tuple == b->tuple && a->hint == b->hint;
}
#define BPS_TREE_NAME memtx_tree
#define BPS_TREE_BLOCK_SIZE (512)
#define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
#define BPS_TREE_COMPARE(a, b, arg)\
- tuple_compare((&a)->tuple, (&b)->tuple, arg)
+ tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
+ (&b)->hint, arg)
#define BPS_TREE_COMPARE_KEY(a, b, arg)\
- tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg)
+ tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\
+ (b)->part_count, (b)->hint, arg)
#define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b)
#define bps_tree_elem_t struct memtx_tree_data
#define bps_tree_key_t struct memtx_tree_key_data *
@@ -113,7 +125,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c)
const struct memtx_tree_data *data_a = a;
const struct memtx_tree_data *data_b = b;
struct key_def *key_def = c;
- return tuple_compare(data_a->tuple, data_b->tuple, key_def);
+ return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple,
+ data_b->hint, key_def);
}
/* {{{ MemtxTree Iterators ****************************************/
@@ -235,9 +248,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret)
memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
/* Use user key def to save a few loops. */
if (res == NULL ||
- tuple_compare_with_key(res->tuple, it->key_data.key,
- it->key_data.part_count,
- it->index_def->key_def) != 0) {
+ tuple_compare_with_key_hinted(res->tuple, res->hint,
+ it->key_data.key,
+ it->key_data.part_count,
+ it->key_data.hint,
+ it->index_def->key_def) != 0) {
iterator->next = tree_iterator_dummie;
it->current.tuple = NULL;
*ret = NULL;
@@ -266,9 +281,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
/* Use user key def to save a few loops. */
if (res == NULL ||
- tuple_compare_with_key(res->tuple, it->key_data.key,
- it->key_data.part_count,
- it->index_def->key_def) != 0) {
+ tuple_compare_with_key_hinted(res->tuple, res->hint,
+ it->key_data.key,
+ it->key_data.part_count,
+ it->key_data.hint,
+ it->index_def->key_def) != 0) {
iterator->next = tree_iterator_dummie;
it->current.tuple = NULL;
*ret = NULL;
@@ -516,9 +533,11 @@ memtx_tree_index_get(struct index *base, const char *key,
assert(base->def->opts.is_unique &&
part_count == base->def->key_def->part_count);
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
struct memtx_tree_key_data key_data;
key_data.key = key;
key_data.part_count = part_count;
+ key_data.hint = key_hint(key, cmp_def);
struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
*result = res != NULL ? res->tuple : NULL;
return 0;
@@ -530,9 +549,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
struct tuple **result)
{
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *key_def = index->tree.arg;
if (new_tuple) {
struct memtx_tree_data new_data;
new_data.tuple = new_tuple;
+ new_data.hint = tuple_hint(new_tuple, key_def);
struct memtx_tree_data dup_data;
dup_data.tuple = NULL;
@@ -565,6 +586,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
if (old_tuple) {
struct memtx_tree_data old_data;
old_data.tuple = old_tuple;
+ old_data.hint = tuple_hint(old_tuple, key_def);
memtx_tree_delete(&index->tree, old_data);
}
*result = old_tuple;
@@ -607,6 +629,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
it->type = type;
it->key_data.key = key;
it->key_data.part_count = part_count;
+ struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
+ it->key_data.hint = key_hint(key, cmp_def);
it->index_def = base->def;
it->tree = &index->tree;
it->tree_iterator = memtx_tree_invalid_iterator();
@@ -671,6 +695,8 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
struct memtx_tree_data *elem =
&index->build_array[index->build_array_size++];
elem->tuple = tuple;
+ struct key_def *key_def = index->tree.arg;
+ elem->hint = tuple_hint(tuple, key_def);
return 0;
}
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 1bcff2ca3..86922c9bb 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -34,6 +34,383 @@
#include "trivia/util.h" /* NOINLINE */
#include <math.h>
+/**
+ * Tuple hints
+ *
+ * A hint is a value h(t) calculated for a tuple in terms of a
+ * particular key_def with the following rules:
+ * if h(t1) < h(t2) then t1 < t2;
+ * if h(t1) > h(t2) then t1 > t2;
+ * if h(t1) == h(t2) then a regular comparison is required;
+ * These rules mean that instead of a direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their
+ * hints first, and only if the hints are equal compare
+ * the tuples themselves.
+ *
+ * The comparison hint has the following structure:
+ * uint64_t: [ CMP_HINT_TYPE | CMP_HINT_VALUE ]
+ * 64 62 0 bit
+ *
+ * The reserved value HINT_INVALID is returned when hint is
+ * undefined.
+ */
+#define HINT_MAX (((int64_t)1 << 61) - 1)
+#define HINT_MIN (-HINT_MAX - 1)
+#define HINT_UMAX (((uint64_t)1 << 62) - 1)
+
+#define HINT_INVALID UINT64_MAX
+
+#define HINT_TYPE_NUMBER ((uint8_t)1)
+#define HINT_TYPE_STRING ((uint8_t)2)
+#define HINT_TYPE_BOOL ((uint8_t)3)
+
+#define HINT_TYPE(hint) ((uint64_t)hint >> 62)
+#define HINT_VALUE(hint) ((uint64_t)hint & (((uint64_t)1 << 62) - 1))
+#define HINT(type, value) ({ \
+ assert(HINT_TYPE(value) == 0); \
+ (((uint64_t)type << 62) | (uint64_t)value);})
+
+/**
+ * Perform hints comparison.
+ * Returns 1 when hint_a > hint_b, -1 when hint_b > hint_a;
+ * If the hints are incompatible or equal, a value of 0 is
+ * returned, and the further comparison of the original data is
+ * required.
+ */
+static int
+hint_cmp(uint64_t hint_a, uint64_t hint_b)
+{
+ if (hint_a != HINT_INVALID && hint_b != HINT_INVALID &&
+ HINT_TYPE(hint_a) == HINT_TYPE(hint_b) &&
+ HINT_VALUE(hint_a) != HINT_VALUE(hint_b))
+ return hint_a < hint_b ? -1 : 1;
+ else
+ return 0;
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_uint(const char *key, struct key_def *key_def)
+{
+ (void)key_def;
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
+ uint64_t ret;
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_NUMBER, 0);
+ assert(mp_typeof(*key) == MP_UINT);
+ uint64_t val = mp_decode_uint(&key);
+ ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
+ return HINT(HINT_TYPE_NUMBER, ret);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_uint(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_NUMBER, 0);
+ return key_hint_uint<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_int(const char *key, struct key_def *key_def)
+{
+ (void)key_def;
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_INTEGER);
+ uint64_t ret;
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_NUMBER, 0);
+ if (mp_typeof(*key) == MP_UINT) {
+ uint64_t val = mp_decode_uint(&key);
+ ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
+ } else {
+ assert(mp_typeof(*key) == MP_INT);
+ int64_t val = mp_decode_int(&key);
+ ret = val > HINT_MAX ? HINT_UMAX :
+ val < HINT_MIN ? 0 :
+ (uint64_t)val - (uint64_t)HINT_MIN;
+ }
+ return HINT(HINT_TYPE_NUMBER, ret);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_int(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_INTEGER);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_NUMBER, 0);
+ return key_hint_int<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_number(const char *key, struct key_def *key_def)
+{
+ (void)key_def;
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_NUMBER ||
+ key_def->parts->type == FIELD_TYPE_SCALAR);
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_NUMBER, 0);
+ uint64_t ret;
+ switch (mp_typeof(*key)) {
+ case MP_FLOAT:
+ case MP_DOUBLE: {
+ double f = mp_typeof(*key) == MP_FLOAT ?
+ mp_decode_float(&key) : mp_decode_double(&key);
+ if (!isfinite(f))
+ return HINT_INVALID;
+ double val;
+ (void)modf(f, &val);
+ ret = val > (double)HINT_MAX ? HINT_UMAX :
+ val < (double)HINT_MIN ? 0 :
+ (uint64_t)val - (uint64_t)HINT_MIN;
+ break;
+ }
+ case MP_INT: {
+ int64_t val = mp_decode_int(&key);
+ ret = val > HINT_MAX ? HINT_UMAX :
+ val < HINT_MIN ? 0 :
+ (uint64_t)val - (uint64_t)HINT_MIN;
+ break;
+ }
+ case MP_UINT: {
+ uint64_t val = mp_decode_uint(&key);
+ ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
+ break;
+ }
+ default:
+ unreachable();
+ }
+ return HINT(HINT_TYPE_NUMBER, ret);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_number(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_NUMBER);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_NUMBER, 0);
+ return key_hint_number<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_boolean(const char *key, struct key_def *key_def)
+{
+ (void)key_def;
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_BOOLEAN ||
+ key_def->parts->type == FIELD_TYPE_SCALAR);
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_BOOL, 0);
+ return HINT(HINT_TYPE_BOOL, (uint64_t)mp_decode_bool(&key));
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_boolean(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_BOOLEAN);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_BOOL, 0);
+ return key_hint_boolean<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_string(const char *key, struct key_def *key_def)
+{
+ (void)key_def;
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->coll == NULL);
+ assert(key_def->parts->type == FIELD_TYPE_STRING ||
+ key_def->parts->type == FIELD_TYPE_SCALAR);
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_STRING, 0);
+ assert(mp_typeof(*key) == MP_STR);
+ uint32_t len;
+ const unsigned char *str =
+ (const unsigned char *)mp_decode_str(&key, &len);
+ uint64_t result = 0;
+ uint32_t process_len = MIN(len, 7);
+ for (uint32_t i = 0; i < process_len; i++) {
+ result <<= 8;
+ result |= str[i];
+ }
+ result <<= 8 * (7 - process_len);
+ return HINT(HINT_TYPE_STRING, result);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_string(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->coll == NULL);
+ assert(key_def->parts->type == FIELD_TYPE_STRING);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_STRING, 0);
+ return key_hint_string<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_string_coll(const char *key, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert((key_def->parts->type == FIELD_TYPE_STRING ||
+ key_def->parts->type == FIELD_TYPE_SCALAR) &&
+ key_def->parts->coll != NULL);
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_STRING, 0);
+ assert(mp_typeof(*key) == MP_STR);
+ uint32_t len;
+ const char *str = mp_decode_str(&key, &len);
+ return key_def->parts->coll->hint(str, len, key_def->parts->coll);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_string_coll(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_STRING &&
+ key_def->parts->coll != NULL);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_STRING, 0);
+ return key_hint_string_coll<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_scalar(const char *key, struct key_def *key_def)
+{
+ (void)key_def;
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_SCALAR);
+ if ((is_optional && key == NULL) ||
+ (is_nullable && mp_typeof(*key) == MP_NIL))
+ return HINT(HINT_TYPE_BOOL, 0);
+ switch (mp_typeof(*key)) {
+ case MP_INT:
+ case MP_UINT:
+ case MP_FLOAT:
+ case MP_DOUBLE:
+ return key_hint_number<is_optional, is_nullable>(key, key_def);
+ case MP_BOOL:
+ return key_hint_boolean<is_optional, is_nullable>(key, key_def);
+ case MP_STR:
+ if (key_def->parts->coll == NULL)
+ return key_hint_string<is_optional,
+ is_nullable>(key, key_def);
+ else
+ return key_hint_string_coll<is_optional,
+ is_nullable>(key, key_def);
+ default:
+ return HINT_INVALID;
+ }
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_scalar(const struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(key_def->parts->type == FIELD_TYPE_SCALAR);
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return HINT(HINT_TYPE_BOOL, 0);
+ return key_hint_scalar<false, is_nullable>(field, key_def);
+}
+
+void
+key_def_set_hint_func(struct key_def *def)
+{
+ def->key_hint = NULL;
+ def->tuple_hint = NULL;
+ bool is_nullable = key_part_is_nullable(def->parts);
+ switch (def->parts->type) {
+ case FIELD_TYPE_UNSIGNED: {
+ def->key_hint = is_nullable ? key_hint_uint<true, true> :
+ key_hint_uint<true, false>;
+ def->tuple_hint = is_nullable ? tuple_hint_uint<true> :
+ tuple_hint_uint<false>;
+ break;
+ }
+ case FIELD_TYPE_INTEGER: {
+ def->key_hint = is_nullable ? key_hint_int<true, true> :
+ key_hint_int<true, false>;
+ def->tuple_hint = is_nullable ? tuple_hint_int<true> :
+ tuple_hint_int<false>;
+ break;
+ }
+ case FIELD_TYPE_STRING: {
+ if (def->parts->coll != NULL) {
+ def->key_hint =
+ is_nullable ? key_hint_string_coll<true, true> :
+ key_hint_string_coll<true, false>;
+ def->tuple_hint =
+ is_nullable ? tuple_hint_string_coll<true> :
+ tuple_hint_string_coll<false>;
+ } else {
+ def->key_hint =
+ is_nullable ? key_hint_string<true, true> :
+ key_hint_string<true, false>;
+ def->tuple_hint = is_nullable ? tuple_hint_string<true> :
+ tuple_hint_string<false>;
+ }
+ break;
+ }
+ case FIELD_TYPE_NUMBER: {
+ def->key_hint = is_nullable ? key_hint_number<true, true> :
+ key_hint_number<true, false>;
+ def->tuple_hint = is_nullable ? tuple_hint_number<true> :
+ tuple_hint_number<false>;
+ break;
+ }
+ case FIELD_TYPE_BOOLEAN: {
+ def->key_hint = is_nullable ? key_hint_boolean<true, true> :
+ key_hint_boolean<true, false>;
+ def->tuple_hint = is_nullable ? tuple_hint_boolean<true> :
+ tuple_hint_boolean<false>;
+ break;
+ }
+ case FIELD_TYPE_SCALAR: {
+ def->key_hint = is_nullable ? key_hint_scalar<true, true> :
+ key_hint_scalar<true, false>;
+ def->tuple_hint = is_nullable ? tuple_hint_scalar<true> :
+ tuple_hint_scalar<false>;
+ break;
+ }
+ default: {
+ break;
+ }
+ }
+}
+
/* {{{ tuple_compare */
/*
@@ -460,13 +837,17 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static inline int
-tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
- struct key_def *key_def)
+tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
+ uint64_t tuple_a_hint, const struct tuple *tuple_b,
+ uint64_t tuple_b_hint, struct key_def *key_def)
{
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ int rc = 0;
+ if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
+ return rc;
struct key_part *part = key_def->parts;
const char *tuple_a_raw = tuple_data(tuple_a);
const char *tuple_b_raw = tuple_data(tuple_b);
@@ -502,7 +883,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
struct key_part *end;
const char *field_a, *field_b;
enum mp_type a_type, b_type;
- int rc;
if (is_nullable)
end = part + key_def->unique_part_count;
else
@@ -592,8 +972,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static inline int
-tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
- uint32_t part_count, struct key_def *key_def)
+tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
+ struct key_def *key_def)
+{
+ return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts,
+ has_json_paths>(tuple_a, HINT_INVALID, tuple_b,
+ HINT_INVALID, key_def);
+}
+
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
+ uint64_t tuple_hint, const char *key, uint32_t part_count,
+ uint64_t key_hint, struct key_def *key_def)
{
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
@@ -601,6 +992,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
assert(has_optional_parts == key_def->has_optional_parts);
assert(key != NULL || part_count == 0);
assert(part_count <= key_def->part_count);
+ int rc = 0;
+ if ((rc = hint_cmp(tuple_hint, key_hint)) != 0)
+ return rc;
struct key_part *part = key_def->parts;
struct tuple_format *format = tuple_format(tuple);
const char *tuple_raw = tuple_data(tuple);
@@ -636,7 +1030,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
}
struct key_part *end = part + part_count;
- int rc;
for (; part < end; ++part, mp_next(&key)) {
const char *field;
if (has_json_paths) {
@@ -675,6 +1068,17 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
return 0;
}
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
+ uint32_t part_count, struct key_def *key_def)
+{
+ return tuple_compare_with_key_slowpath_hinted<is_nullable,
+ has_optional_parts, has_json_paths>(tuple,
+ HINT_INVALID, key, part_count,
+ HINT_INVALID, key_def);
+}
+
template<bool is_nullable>
static inline int
key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
@@ -733,13 +1137,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
template<bool is_nullable, bool has_optional_parts>
static inline int
-tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
- uint32_t part_count, struct key_def *key_def)
+tuple_compare_with_key_sequential_hinted(const struct tuple *tuple,
+ uint64_t tuple_hint, const char *key, uint32_t part_count,
+ uint64_t key_hint, struct key_def *key_def)
{
assert(!has_optional_parts || is_nullable);
assert(key_def_is_sequential(key_def));
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
const char *tuple_key = tuple_data(tuple);
uint32_t field_count = mp_decode_array(&tuple_key);
uint32_t cmp_part_count;
@@ -749,8 +1157,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
assert(field_count >= part_count);
cmp_part_count = part_count;
}
- int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
- key_def);
+ rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
+ key_def);
if (!has_optional_parts || rc != 0)
return rc;
/*
@@ -772,6 +1180,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
return 0;
}
+template<bool is_nullable, bool has_optional_parts>
+static inline int
+tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
+ uint32_t part_count, struct key_def *key_def)
+{
+ return tuple_compare_with_key_sequential_hinted<is_nullable,
+ has_optional_parts>(tuple, HINT_INVALID, key,
+ part_count, HINT_INVALID, key_def);
+}
+
int
key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
{
@@ -792,9 +1210,13 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
template <bool is_nullable, bool has_optional_parts>
static int
-tuple_compare_sequential(const struct tuple *tuple_a,
- const struct tuple *tuple_b, key_def *key_def)
+tuple_compare_sequential_hinted(const struct tuple *tuple_a,
+ uint64_t tuple_a_hint, const struct tuple *tuple_b,
+ uint64_t tuple_b_hint, key_def *key_def)
{
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
assert(!has_optional_parts || is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
assert(key_def_is_sequential(key_def));
@@ -812,7 +1234,6 @@ tuple_compare_sequential(const struct tuple *tuple_a,
bool was_null_met = false;
struct key_part *part = key_def->parts;
struct key_part *end = part + key_def->unique_part_count;
- int rc;
uint32_t i = 0;
for (; part < end; ++part, ++i) {
enum mp_type a_type, b_type;
@@ -859,6 +1280,16 @@ tuple_compare_sequential(const struct tuple *tuple_a,
return 0;
}
+template <bool is_nullable, bool has_optional_parts>
+static int
+tuple_compare_sequential(const struct tuple *tuple_a,
+ const struct tuple *tuple_b, key_def *key_def)
+{
+ return tuple_compare_sequential_hinted<is_nullable,
+ has_optional_parts>(tuple_a, HINT_INVALID, tuple_b,
+ HINT_INVALID, key_def);
+}
+
template <int TYPE>
static inline int
field_compare(const char **field_a, const char **field_b);
@@ -989,6 +1420,18 @@ struct TupleCompare
compare(tuple_a, tuple_b, format_a,
format_b, field_a, field_b);
}
+
+ static int compare_hinted(const struct tuple *tuple_a,
+ uint64_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ uint64_t tuple_b_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple_a, tuple_b, key_def);
+ }
};
template <int TYPE, int ...MORE_TYPES>
@@ -1006,15 +1449,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> {
return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b,
format_a, format_b, field_a, field_b);
}
+
+ static int compare_hinted(const struct tuple *tuple_a,
+ uint64_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ uint64_t tuple_b_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple_a, tuple_b, key_def);
+ }
};
} /* end of anonymous namespace */
struct comparator_signature {
tuple_compare_t f;
+ tuple_compare_hinted_t f_hinted;
uint32_t p[64];
};
#define COMPARATOR(...) \
- { TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } },
+ { TupleCompare<__VA_ARGS__>::compare, \
+ TupleCompare<__VA_ARGS__>::compare_hinted, \
+ { __VA_ARGS__, UINT32_MAX } },
/**
* field1 no, field1 type, field2 no, field2 type, ...
@@ -1049,6 +1507,17 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
tuple_compare_slowpath<true, true, true>
};
+static const tuple_compare_hinted_t compare_slowpath_hinted_funcs[] = {
+ tuple_compare_slowpath_hinted<false, false, false>,
+ tuple_compare_slowpath_hinted<true, false, false>,
+ tuple_compare_slowpath_hinted<false, true, false>,
+ tuple_compare_slowpath_hinted<true, true, false>,
+ tuple_compare_slowpath_hinted<false, false, true>,
+ tuple_compare_slowpath_hinted<true, false, true>,
+ tuple_compare_slowpath_hinted<false, true, true>,
+ tuple_compare_slowpath_hinted<true, true, true>
+};
+
void
key_def_set_tuple_compare(struct key_def *def)
{
@@ -1060,13 +1529,21 @@ key_def_set_tuple_compare(struct key_def *def)
if (def->has_optional_parts) {
def->tuple_compare =
tuple_compare_sequential<true, true>;
+ def->tuple_compare_hinted =
+ tuple_compare_sequential_hinted<true,
+ true>;
} else {
def->tuple_compare =
tuple_compare_sequential<true, false>;
+ def->tuple_compare_hinted =
+ tuple_compare_sequential_hinted<true,
+ false>;
}
} else {
def->tuple_compare =
compare_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare_hinted =
+ compare_slowpath_hinted_funcs[cmp_func_idx];
}
return;
}
@@ -1085,13 +1562,20 @@ key_def_set_tuple_compare(struct key_def *def)
if (i == def->part_count &&
cmp_arr[k].p[i * 2] == UINT32_MAX) {
def->tuple_compare = cmp_arr[k].f;
+ def->tuple_compare_hinted = cmp_arr[k].f_hinted;
return;
}
}
}
- def->tuple_compare = key_def_is_sequential(def) ?
- tuple_compare_sequential<false, false> :
- compare_slowpath_funcs[cmp_func_idx];
+ if (key_def_is_sequential(def)) {
+ def->tuple_compare = tuple_compare_sequential<false, false>;
+ def->tuple_compare_hinted =
+ tuple_compare_sequential_hinted<false, false>;
+ } else {
+ def->tuple_compare = compare_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare_hinted =
+ compare_slowpath_hinted_funcs[cmp_func_idx];
+ }
}
/* }}} tuple_compare */
@@ -1222,6 +1706,17 @@ struct TupleCompareWithKey
compare(tuple, key, part_count,
key_def, format, field);
}
+
+ static int
+ compare_hinted(const struct tuple *tuple, uint64_t tuple_hint,
+ const char *key, uint32_t part_count, uint64_t key_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple, key, part_count, key_def);
+ }
};
template <int TYPE, int ...MORE_TYPES>
@@ -1242,6 +1737,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
compare(tuple, key, part_count,
key_def, format, field);
}
+
+ static int
+ compare_hinted(const struct tuple *tuple, uint64_t tuple_hint,
+ const char *key, uint32_t part_count, uint64_t key_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple, key, part_count, key_def);
+ }
};
} /* end of anonymous namespace */
@@ -1249,11 +1755,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
struct comparator_with_key_signature
{
tuple_compare_with_key_t f;
+ tuple_compare_with_key_hinted_t f_hinted;
uint32_t p[64];
};
#define KEY_COMPARATOR(...) \
- { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
+ { TupleCompareWithKey<0, __VA_ARGS__>::compare, \
+ TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \
+ { __VA_ARGS__ } },
static const comparator_with_key_signature cmp_wk_arr[] = {
KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
@@ -1284,6 +1793,18 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
tuple_compare_with_key_slowpath<true, true, true>
};
+static const tuple_compare_with_key_hinted_t
+ compare_with_key_slowpath_hinted_funcs[] = {
+ tuple_compare_with_key_slowpath_hinted<false, false, false>,
+ tuple_compare_with_key_slowpath_hinted<true, false, false>,
+ tuple_compare_with_key_slowpath_hinted<false, true, false>,
+ tuple_compare_with_key_slowpath_hinted<true, true, false>,
+ tuple_compare_with_key_slowpath_hinted<false, false, true>,
+ tuple_compare_with_key_slowpath_hinted<true, false, true>,
+ tuple_compare_with_key_slowpath_hinted<false, true, true>,
+ tuple_compare_with_key_slowpath_hinted<true, true, true>
+};
+
void
key_def_set_tuple_compare_with_key(struct key_def *def)
{
@@ -1296,14 +1817,23 @@ key_def_set_tuple_compare_with_key(struct key_def *def)
def->tuple_compare_with_key =
tuple_compare_with_key_sequential<true,
true>;
+ def->tuple_compare_with_key_hinted =
+ tuple_compare_with_key_sequential_hinted<
+ true, true>;
} else {
def->tuple_compare_with_key =
tuple_compare_with_key_sequential<true,
false>;
+ def->tuple_compare_with_key_hinted =
+ tuple_compare_with_key_sequential_hinted<
+ true, false>;
}
} else {
def->tuple_compare_with_key =
compare_with_key_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare_with_key_hinted =
+ compare_with_key_slowpath_hinted_funcs[
+ cmp_func_idx];
}
return;
}
@@ -1325,14 +1855,23 @@ key_def_set_tuple_compare_with_key(struct key_def *def)
}
if (i == def->part_count) {
def->tuple_compare_with_key = cmp_wk_arr[k].f;
+ def->tuple_compare_with_key_hinted =
+ cmp_wk_arr[k].f_hinted;
return;
}
}
}
- def->tuple_compare_with_key =
- key_def_is_sequential(def) ?
- tuple_compare_with_key_sequential<false, false> :
- compare_with_key_slowpath_funcs[cmp_func_idx];
+ if (key_def_is_sequential(def)) {
+ def->tuple_compare_with_key =
+ tuple_compare_with_key_sequential<false, false>;
+ def->tuple_compare_with_key_hinted =
+ tuple_compare_with_key_sequential_hinted<false, false>;
+ } else {
+ def->tuple_compare_with_key =
+ compare_with_key_slowpath_funcs[cmp_func_idx];
+ def->tuple_compare_with_key_hinted =
+ compare_with_key_slowpath_hinted_funcs[cmp_func_idx];
+ }
}
/* }}} tuple_compare_with_key */
@@ -1342,4 +1881,5 @@ key_def_set_compare_func(struct key_def *def)
{
key_def_set_tuple_compare(def);
key_def_set_tuple_compare_with_key(def);
+ key_def_set_hint_func(def);
}
diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
index afc15e809..3bd7c44f5 100644
--- a/src/lib/coll/coll.c
+++ b/src/lib/coll/coll.c
@@ -133,6 +133,37 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
return s_len;
}
+/** Get a compare hint of a binary collation. */
+static uint64_t
+coll_bin_hint(const char *s, size_t s_len, struct coll *coll)
+{
+ (void)coll;
+ uint64_t result = 0;
+ uint32_t process_len = MIN(s_len, 7);
+ for (uint32_t i = 0; i < process_len; i++) {
+ result <<= 8;
+ result |= ((unsigned char *)s)[i];
+ }
+ result <<= 8 * (7 - process_len);
+ return result;
+}
+
+/** Get a compare hint of a string using ICU collation. */
+static uint64_t
+coll_icu_hint(const char *s, size_t s_len, struct coll *coll)
+{
+ assert(coll->type == COLL_TYPE_ICU);
+ UCharIterator itr;
+ uiter_setUTF8(&itr, s, s_len);
+ uint8_t buf[7];
+ uint32_t state[2] = {0, 0};
+ UErrorCode status = U_ZERO_ERROR;
+ int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, buf,
+ sizeof(buf), &status);
+ assert(got >= 0 && got <= 7);
+ return coll_bin_hint((const char *)buf, got, NULL);
+}
+
/**
* Set up ICU collator and init cmp and hash members of collation.
* @param coll Collation to set up.
@@ -242,6 +273,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
}
coll->cmp = coll_icu_cmp;
coll->hash = coll_icu_hash;
+ coll->hint = coll_icu_hint;
return 0;
}
@@ -356,6 +388,7 @@ coll_new(const struct coll_def *def)
coll->collator = NULL;
coll->cmp = coll_bin_cmp;
coll->hash = coll_bin_hash;
+ coll->hint = coll_bin_hint;
break;
default:
unreachable();
diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h
index 8c9f94293..d695a02ad 100644
--- a/src/lib/coll/coll.h
+++ b/src/lib/coll/coll.h
@@ -48,6 +48,8 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
uint32_t *pcarry, struct coll *coll);
+typedef uint64_t (*coll_hint_f)(const char *s, size_t s_len, struct coll *coll);
+
struct UCollator;
/**
@@ -62,6 +64,8 @@ struct coll {
/** String comparator. */
coll_cmp_f cmp;
coll_hash_f hash;
+ /** The pointer to routine calculating tuple hint. */
+ coll_hint_f hint;
/** Reference counter. */
int refs;
/**
--
2.21.0
* [PATCH v6 3/3] box: introduce multikey indexes
2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
2 siblings, 0 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov
I haven't finished the review fixes for multikey indexes, but I've
included this part to show how the modified hint changes affect
multikey indexes.
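For context, every hinted comparator in the previous patch follows the
same pattern: it calls hint_cmp() first and falls back to the full
tuple comparison only when the hints are inconclusive. A minimal
self-contained sketch of that contract (the real hint encodes a type
tag in the upper bytes, which this sketch omits):

```c
#include <stdint.h>

#define HINT_INVALID UINT64_MAX

/* Sketch of the hint contract: if both hints are valid and differ,
 * their numeric order decides the tuple order; otherwise the caller
 * must perform a real comparison. */
static int
hint_cmp(uint64_t hint_a, uint64_t hint_b)
{
	if (hint_a != HINT_INVALID && hint_b != HINT_INVALID &&
	    hint_a != hint_b)
		return hint_a < hint_b ? -1 : 1;
	return 0; /* hints are inconclusive */
}
```

A comparator then reads `if ((rc = hint_cmp(a, b)) != 0) return rc;`
and only touches the tuple data on the rc == 0 path.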
=========================================================================
With the multikey index feature introduction, a JSON path may have a
[*] placeholder instead of an array index: this allows indexing
multiple documents by one JSON path automatically.
Example:
s = box.schema.space.create('withdata')
idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'},
{3, 'str', path = '[*].sname'}}})
_ = s:insert({1, 1, {{fname='James', sname='Bond'},
{fname='Austin', sname='Powers'}}})
idx:get({'Austin', 'Powers'})
---
- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Austin',
'sname': 'Powers'}]]
...
Closes #1257
---
src/box/key_def.c | 15 ++++
src/box/key_def.h | 5 ++
src/box/memtx_tree.c | 175 +++++++++++++++++++++++++++++++++++++-
src/box/tuple.c | 8 +-
src/box/tuple.h | 122 +++++++++++++++++++++++---
src/box/tuple_compare.cc | 115 +++++++++++++++++++------
src/box/tuple_format.c | 120 +++++++++++++++++++++-----
test/engine/json.result | 80 ++++++++++++++++-
test/engine/json.test.lua | 20 +++++
9 files changed, 592 insertions(+), 68 deletions(-)
diff --git a/src/box/key_def.c b/src/box/key_def.c
index d4c979aa1..b9cc77699 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -164,9 +164,24 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
*path_pool += path_len;
memcpy(def->parts[part_no].path, path, path_len);
def->parts[part_no].path_len = path_len;
+
+ int rc;
+ struct json_lexer lexer;
+ struct json_token token;
+ json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
+ while ((rc = json_lexer_next_token(&lexer, &token)) == 0) {
+ if (token.type == JSON_TOKEN_ANY) {
+ def->has_multikey_parts = true;
+ def->parts[part_no].is_multikey = true;
+ break;
+ } else if (token.type == JSON_TOKEN_END) {
+ break;
+ }
+ }
} else {
def->parts[part_no].path = NULL;
def->parts[part_no].path_len = 0;
+ def->parts[part_no].is_multikey = false;
}
column_mask_set_fieldno(&def->column_mask, fieldno);
}
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 0d8f3112e..6a914df11 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -97,6 +97,8 @@ struct key_part {
char *path;
/** The length of JSON path. */
uint32_t path_len;
+ /** True if this is multikey key part. */
+ bool is_multikey;
/**
* Epoch of the tuple format the offset slot cached in
* this part is valid for, see tuple_format::epoch.
@@ -205,6 +207,8 @@ struct key_def {
bool is_nullable;
/** True if some key part has JSON path. */
bool has_json_paths;
+ /** True if some part has array index placeholder *. */
+ bool has_multikey_parts;
/**
* True, if some key parts can be absent in a tuple. These
* fields assumed to be MP_NIL.
@@ -699,6 +703,7 @@ key_hash(const char *key, struct key_def *key_def)
* Get a comparison hint for a @a tuple.
* @param tuple - tuple to get uint64_t of.
* @param key_def - key_def that defines which comparison is used.
+ * @param multikey_idx - index of multikey array item.
* @return the comparison auxiliary information.
*/
static inline uint64_t
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index 2c1933ceb..bfc618bdf 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -537,7 +537,8 @@ memtx_tree_index_get(struct index *base, const char *key,
struct memtx_tree_key_data key_data;
key_data.key = key;
key_data.part_count = part_count;
- key_data.hint = key_hint(key, cmp_def);
+ if (!cmp_def->has_multikey_parts)
+ key_data.hint = key_hint(key, cmp_def);
struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
*result = res != NULL ? res->tuple : NULL;
return 0;
@@ -593,6 +594,98 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
return 0;
}
+static int
+multikey_get_array_sz(struct tuple *tuple, struct key_def *key_def)
+{
+ assert(key_def->has_multikey_parts);
+ struct key_part *part = key_def->parts;
+ /* FIXME: don't like it... */
+ while (part < key_def->parts + key_def->part_count &&
+ !part->is_multikey)
+ part++;
+ assert(part->path != NULL && part->is_multikey);
+ struct tuple_field *field =
+ tuple_format_field_by_path(tuple_format(tuple), part->fieldno,
+ part->path, part->path_len);
+ assert(field != NULL);
+ struct field_map_ext *field_map_ext =
+ tuple_field_map_ext((uint32_t *)tuple_field_map(tuple),
+ field->offset_slot);
+ return field_map_ext->size;
+}
+
+int
+memtx_multikey_tree_index_replace(struct index *base, struct tuple *old_tuple,
+ struct tuple *new_tuple,
+ enum dup_replace_mode mode,
+ struct tuple **result)
+{
+ struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *key_def = index->tree.arg;
+ if (new_tuple != NULL) {
+ int errcode = 0, tree_res = 0;
+ struct tuple *dup_tuple = NULL;
+ int multikey_idx = 0;
+ int sz = multikey_get_array_sz(new_tuple, key_def);
+ for (; multikey_idx < sz; multikey_idx++) {
+ struct memtx_tree_data new_data;
+ new_data.tuple = new_tuple;
+ new_data.hint = multikey_idx;
+ struct memtx_tree_data dup_data;
+ dup_data.tuple = NULL;
+ tree_res = memtx_tree_insert(&index->tree, new_data,
+ &dup_data);
+ if (tree_res != 0) {
+ diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
+ "memtx_tree_index", "replace");
+ break;
+ }
+ dup_tuple = dup_data.tuple;
+ errcode = replace_check_dup(old_tuple, dup_tuple, mode);
+ if (errcode != 0) {
+ memtx_tree_delete(&index->tree, new_data);
+ if (dup_tuple != NULL) {
+ memtx_tree_insert(&index->tree,
+ dup_data, NULL);
+ }
+ struct space *sp =
+ space_cache_find(base->def->space_id);
+ if (sp != NULL) {
+ diag_set(ClientError, errcode,
+ base->def->name,
+ space_name(sp));
+ }
+ break;
+ }
+ }
+ if (tree_res != 0 || errcode != 0) {
+ multikey_idx--;
+ for (; multikey_idx >= 0; multikey_idx--) {
+ struct memtx_tree_data data;
+ data.tuple = new_tuple;
+ data.hint = multikey_idx;
+ memtx_tree_delete(&index->tree, data);
+ }
+ return -1;
+ }
+ if (dup_tuple != NULL) {
+ *result = dup_tuple;
+ return 0;
+ }
+ }
+ if (old_tuple != NULL) {
+ int sz = multikey_get_array_sz(old_tuple, key_def);
+ for (int multikey_idx = 0; multikey_idx < sz; multikey_idx++) {
+ struct memtx_tree_data old_data;
+ old_data.tuple = old_tuple;
+ old_data.hint = multikey_idx;
+ memtx_tree_delete(&index->tree, old_data);
+ }
+ }
+ *result = old_tuple;
+ return 0;
+}
+
static struct iterator *
memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
const char *key, uint32_t part_count)
@@ -630,7 +723,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
it->key_data.key = key;
it->key_data.part_count = part_count;
struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
- it->key_data.hint = key_hint(key, cmp_def);
+ if (!cmp_def->has_multikey_parts)
+ it->key_data.hint = key_hint(key, cmp_def);
it->index_def = base->def;
it->tree = &index->tree;
it->tree_iterator = memtx_tree_invalid_iterator();
@@ -700,6 +794,48 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
return 0;
}
+static int
+memtx_multikey_tree_index_build_next(struct index *base, struct tuple *tuple)
+{
+ struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *key_def = index->tree.arg;
+ int sz = multikey_get_array_sz(tuple, key_def);
+
+ if (index->build_array == NULL) {
+ index->build_array =
+ (struct memtx_tree_data *)malloc(MEMTX_EXTENT_SIZE);
+ if (index->build_array == NULL) {
+ diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
+ "memtx_tree_index", "build_next");
+ return -1;
+ }
+ index->build_array_alloc_size =
+ MEMTX_EXTENT_SIZE / sizeof(index->build_array[0]);
+ }
+ assert(index->build_array_size <= index->build_array_alloc_size);
+ if (index->build_array_size == index->build_array_alloc_size) {
+ index->build_array_alloc_size = index->build_array_alloc_size +
+ index->build_array_alloc_size / 2;
+ struct memtx_tree_data *tmp =
+ realloc(index->build_array,
+ index->build_array_alloc_size * sizeof(*tmp));
+ if (tmp == NULL) {
+ diag_set(OutOfMemory, index->build_array_alloc_size *
+ sizeof(*tmp), "memtx_tree_index", "build_next");
+ return -1;
+ }
+ index->build_array = tmp;
+ }
+ for (int multikey_idx = 0; multikey_idx < sz; multikey_idx++) {
+ struct memtx_tree_data *elem =
+ &index->build_array[index->build_array_size++];
+ assert(index->build_array_size <= index->build_array_alloc_size);
+ elem->tuple = tuple;
+ elem->hint = multikey_idx;
+ }
+ return 0;
+}
+
static void
memtx_tree_index_end_build(struct index *base)
{
@@ -802,6 +938,36 @@ static const struct index_vtab memtx_tree_index_vtab = {
/* .end_build = */ memtx_tree_index_end_build,
};
+static const struct index_vtab memtx_multikey_tree_index_vtab = {
+ /* .destroy = */ memtx_tree_index_destroy,
+ /* .commit_create = */ generic_index_commit_create,
+ /* .abort_create = */ generic_index_abort_create,
+ /* .commit_modify = */ generic_index_commit_modify,
+ /* .commit_drop = */ generic_index_commit_drop,
+ /* .update_def = */ memtx_tree_index_update_def,
+ /* .depends_on_pk = */ memtx_tree_index_depends_on_pk,
+ /* .def_change_requires_rebuild = */
+ memtx_index_def_change_requires_rebuild,
+ /* .size = */ memtx_tree_index_size,
+ /* .bsize = */ memtx_tree_index_bsize,
+ /* .min = */ generic_index_min,
+ /* .max = */ generic_index_max,
+ /* .random = */ memtx_tree_index_random,
+ /* .count = */ memtx_tree_index_count,
+ /* .get = */ memtx_tree_index_get,
+ /* .replace = */ memtx_multikey_tree_index_replace,
+ /* .create_iterator = */ memtx_tree_index_create_iterator,
+ /* .create_snapshot_iterator = */
+ memtx_tree_index_create_snapshot_iterator,
+ /* .stat = */ generic_index_stat,
+ /* .compact = */ generic_index_compact,
+ /* .reset_stat = */ generic_index_reset_stat,
+ /* .begin_build = */ memtx_tree_index_begin_build,
+ /* .reserve = */ memtx_tree_index_reserve,
+ /* .build_next = */ memtx_multikey_tree_index_build_next,
+ /* .end_build = */ memtx_tree_index_end_build,
+};
+
struct index *
memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
{
@@ -812,8 +978,11 @@ memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
"malloc", "struct memtx_tree_index");
return NULL;
}
+ const struct index_vtab *vtab = def->key_def->has_multikey_parts ?
+ &memtx_multikey_tree_index_vtab :
+ &memtx_tree_index_vtab;
if (index_create(&index->base, (struct engine *)memtx,
- &memtx_tree_index_vtab, def) != 0) {
+ vtab, def) != 0) {
free(index);
return NULL;
}
diff --git a/src/box/tuple.c b/src/box/tuple.c
index 7f06d4053..c8a938c4c 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -455,7 +455,8 @@ tuple_field_go_to_key(const char **field, const char *key, int len)
}
int
-tuple_go_to_path(const char **data, const char *path, uint32_t path_len)
+tuple_go_to_path_multikey(const char **data, const char *path,
+ uint32_t path_len, uint64_t multikey_idx)
{
int rc;
struct json_lexer lexer;
@@ -463,6 +464,8 @@ tuple_go_to_path(const char **data, const char *path, uint32_t path_len)
json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
while ((rc = json_lexer_next_token(&lexer, &token)) == 0) {
switch (token.type) {
+ case JSON_TOKEN_ANY:
+ token.num = (int)multikey_idx; /* fall through */
case JSON_TOKEN_NUM:
rc = tuple_field_go_to_index(data, token.num);
break;
@@ -532,7 +535,8 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
}
return tuple_field_raw_by_path(format, tuple, field_map, fieldno,
path + lexer.offset,
- path_len - lexer.offset, NULL);
+ path_len - lexer.offset, NULL,
+ INVALID_MULTIKEY_INDEX);
}
/* }}} tuple_field_* getters */
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 8b12fd5a8..e498e1e41 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -45,6 +45,8 @@ struct slab_arena;
struct quota;
struct key_part;
+#define INVALID_MULTIKEY_INDEX UINT64_MAX
+
/**
* A format for standalone tuples allocated on runtime arena.
* \sa tuple_new().
@@ -286,6 +288,7 @@ box_tuple_upsert(const box_tuple_t *tuple, const char *expr, const
/** \endcond public */
+
/**
* An atom of Tarantool storage. Represents MsgPack Array.
* Tuple has the following structure:
@@ -296,7 +299,9 @@ box_tuple_upsert(const box_tuple_t *tuple, const char *expr, const
* | ^
* +---------------------------------------data_offset
*
- * Each 'off_i' is the offset to the i-th indexed field.
+ * Each 'off_i' is the offset to the i-th indexed field, or the
+ * offset (in sizeof(field_map[0]) units) of the field map
+ * extension (struct field_map_ext) for that field.
*/
struct PACKED tuple
{
@@ -328,6 +333,26 @@ struct PACKED tuple
*/
};
+/**
+ * Tuple field map extension. It is allocated automatically by
+ * tuple_field_map_create when the tuple field map must be
+ * extended dynamically.
+ */
+struct field_map_ext {
+ /* Sequence of data offset slots. */
+ uint32_t field_map[1];
+ /* The count of field_map items. */
+ uint32_t size;
+};
+
+static inline struct field_map_ext *
+tuple_field_map_ext(uint32_t *field_map, int32_t root_offset_slot)
+{
+ uint32_t slot_off = field_map[root_offset_slot];
+ return (struct field_map_ext *)((char *)field_map -
+ slot_off * sizeof(uint32_t) - sizeof(struct field_map_ext));
+}
+
/** Size of the tuple including size of struct tuple. */
static inline size_t
tuple_size(const struct tuple *tuple)
@@ -508,11 +533,32 @@ tuple_field_count(const struct tuple *tuple)
* with NULL.
* @param path The path to process.
* @param path_len The length of the @path.
+ * @param multikey_idx The multikey index hint: the index of the
+ * item at the JSON_TOKEN_ANY level.
* @retval 0 On success.
* @retval -1 In case of error in JSON path.
*/
int
-tuple_go_to_path(const char **data, const char *path, uint32_t path_len);
+tuple_go_to_path_multikey(const char **data, const char *path,
+ uint32_t path_len, uint64_t multikey_idx);
+
+/**
+ * Retrieve msgpack data by JSON path.
+ * @param data[in, out] Pointer to msgpack with data.
+ * If the field cannot be retrieved by the
+ * specified path @path, it is overwritten
+ * with NULL.
+ * @param path The path to process.
+ * @param path_len The length of the @path.
+ * @retval 0 On success.
+ * @retval -1 In case of error in JSON path.
+ */
+static inline int
+tuple_go_to_path(const char **data, const char *path, uint32_t path_len)
+{
+ return tuple_go_to_path_multikey(data, path, path_len,
+ INVALID_MULTIKEY_INDEX);
+}
/**
* Get tuple field by field index and relative JSON path.
@@ -533,7 +579,7 @@ static inline const char *
tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
const uint32_t *field_map, uint32_t fieldno,
const char *path, uint32_t path_len,
- int32_t *offset_slot_hint)
+ int32_t *offset_slot_hint, uint64_t multikey_idx)
{
int32_t offset_slot;
if (offset_slot_hint != NULL &&
@@ -558,10 +604,22 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
if (offset_slot_hint != NULL)
*offset_slot_hint = offset_slot;
offset_slot_access:
- /* Indexed field */
- if (field_map[offset_slot] == 0)
- return NULL;
- tuple += field_map[offset_slot];
+ if (multikey_idx != INVALID_MULTIKEY_INDEX) {
+ struct field_map_ext *field_map_ext =
+ tuple_field_map_ext((uint32_t *)field_map,
+ offset_slot);
+ if (multikey_idx >= field_map_ext->size)
+ return NULL;
+ uint32_t off = field_map_ext->field_map[-multikey_idx];
+ if (off == 0)
+ return NULL;
+ tuple += off;
+ } else {
+ /* Indexed field */
+ if (field_map[offset_slot] == 0)
+ return NULL;
+ tuple += field_map[offset_slot];
+ }
} else {
uint32_t field_count;
parse:
@@ -572,8 +630,8 @@ parse:
for (uint32_t k = 0; k < fieldno; k++)
mp_next(&tuple);
if (path != NULL &&
- unlikely(tuple_go_to_path(&tuple, path,
- path_len) != 0))
+ unlikely(tuple_go_to_path_multikey(&tuple, path, path_len,
+ multikey_idx) != 0))
return NULL;
}
return tuple;
@@ -595,7 +653,7 @@ tuple_field_raw(struct tuple_format *format, const char *tuple,
const uint32_t *field_map, uint32_t field_no)
{
return tuple_field_raw_by_path(format, tuple, field_map, field_no,
- NULL, 0, NULL);
+ NULL, 0, NULL, INVALID_MULTIKEY_INDEX);
}
/**
@@ -634,16 +692,18 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
uint32_t path_len, uint32_t path_hash);
/**
- * Get a tuple field pointed to by an index part.
+ * Get a tuple field pointed to by an index part and multikey hint.
* @param format Tuple format.
* @param tuple A pointer to MessagePack array.
* @param field_map A pointer to the LAST element of field map.
+ * @param multikey_idx A multikey hint.
* @param part Index part to use.
* @retval Field data if the field exists or NULL.
*/
static inline const char *
-tuple_field_raw_by_part(struct tuple_format *format, const char *data,
- const uint32_t *field_map, struct key_part *part)
+tuple_field_raw_by_part_multikey(struct tuple_format *format, const char *data,
+ const uint32_t *field_map,
+ struct key_part *part, uint64_t multikey_idx)
{
if (unlikely(part->format_epoch != format->epoch)) {
assert(format->epoch != 0);
@@ -656,7 +716,41 @@ tuple_field_raw_by_part(struct tuple_format *format, const char *data,
}
return tuple_field_raw_by_path(format, data, field_map, part->fieldno,
part->path, part->path_len,
- &part->offset_slot_cache);
+ &part->offset_slot_cache, multikey_idx);
+}
+
+/**
+ * Get a field referred to by index @part and a multikey hint in a tuple.
+ * @param tuple Tuple to get the field from.
+ * @param part Index part to use.
+ * @param multikey_idx A multikey hint.
+ * @retval Field data if the field exists or NULL.
+ */
+static inline const char *
+tuple_field_by_part_multikey(const struct tuple *tuple, struct key_part *part,
+ uint64_t multikey_idx)
+{
+ return tuple_field_raw_by_part_multikey(tuple_format(tuple),
+ tuple_data(tuple),
+ tuple_field_map(tuple), part,
+ multikey_idx);
+}
+
+
+/**
+ * Get a tuple field pointed to by an index part.
+ * @param format Tuple format.
+ * @param tuple A pointer to MessagePack array.
+ * @param field_map A pointer to the LAST element of field map.
+ * @param part Index part to use.
+ * @retval Field data if the field exists or NULL.
+ */
+static inline const char *
+tuple_field_raw_by_part(struct tuple_format *format, const char *data,
+ const uint32_t *field_map, struct key_part *part)
+{
+ return tuple_field_raw_by_part_multikey(format, data, field_map,
+ part, INVALID_MULTIKEY_INDEX);
}
/**
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 86922c9bb..cdb08f8fe 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -351,6 +351,8 @@ key_def_set_hint_func(struct key_def *def)
{
def->key_hint = NULL;
def->tuple_hint = NULL;
+ if (def->has_multikey_parts)
+ return;
bool is_nullable = key_part_is_nullable(def->parts);
switch (def->parts->type) {
case FIELD_TYPE_UNSIGNED: {
@@ -835,7 +837,8 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
return i;
}
-template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths,
+ bool is_multikey>
static inline int
tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
uint64_t tuple_a_hint, const struct tuple *tuple_b,
@@ -845,8 +848,10 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ assert(is_multikey == key_def->has_multikey_parts);
+ assert(!is_multikey || is_multikey == has_json_paths);
int rc = 0;
- if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
+ if (!is_multikey && (rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
return rc;
struct key_part *part = key_def->parts;
const char *tuple_a_raw = tuple_data(tuple_a);
@@ -889,7 +894,14 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
end = part + key_def->part_count;
for (; part < end; part++) {
- if (has_json_paths) {
+ if (is_multikey) {
+ field_a = tuple_field_raw_by_part_multikey(format_a,
+ tuple_a_raw, field_map_a, part,
+ tuple_a_hint);
+ field_b = tuple_field_raw_by_part_multikey(format_b,
+ tuple_b_raw, field_map_b, part,
+ tuple_b_hint);
+ } else if (has_json_paths) {
field_a = tuple_field_raw_by_part(format_a, tuple_a_raw,
field_map_a, part);
field_b = tuple_field_raw_by_part(format_b, tuple_b_raw,
@@ -946,7 +958,14 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
*/
end = key_def->parts + key_def->part_count;
for (; part < end; ++part) {
- if (has_json_paths) {
+ if (is_multikey) {
+ field_a = tuple_field_raw_by_part_multikey(format_a,
+ tuple_a_raw, field_map_a, part,
+ tuple_a_hint);
+ field_b = tuple_field_raw_by_part_multikey(format_b,
+ tuple_b_raw, field_map_b, part,
+ tuple_b_hint);
+ } else if (has_json_paths) {
field_a = tuple_field_raw_by_part(format_a, tuple_a_raw,
field_map_a, part);
field_b = tuple_field_raw_by_part(format_b, tuple_b_raw,
@@ -976,24 +995,28 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
struct key_def *key_def)
{
return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts,
- has_json_paths>(tuple_a, HINT_INVALID, tuple_b,
+ has_json_paths, false>(tuple_a, HINT_INVALID, tuple_b,
HINT_INVALID, key_def);
}
-template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths,
+ bool is_multikey>
static inline int
tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
uint64_t tuple_hint, const char *key, uint32_t part_count,
uint64_t key_hint, struct key_def *key_def)
{
+ (void)key_hint;
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
assert(key != NULL || part_count == 0);
assert(part_count <= key_def->part_count);
+ assert(is_multikey == key_def->has_multikey_parts);
+ assert(!is_multikey || is_multikey == has_json_paths);
int rc = 0;
- if ((rc = hint_cmp(tuple_hint, key_hint)) != 0)
+ if (!is_multikey && (rc = hint_cmp(tuple_hint, key_hint)) != 0)
return rc;
struct key_part *part = key_def->parts;
struct tuple_format *format = tuple_format(tuple);
@@ -1002,7 +1025,10 @@ tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
enum mp_type a_type, b_type;
if (likely(part_count == 1)) {
const char *field;
- if (has_json_paths) {
+ if (is_multikey) {
+ field = tuple_field_raw_by_part_multikey(format,
+ tuple_raw, field_map, part, tuple_hint);
+ } else if (has_json_paths) {
field = tuple_field_raw_by_part(format, tuple_raw,
field_map, part);
} else {
@@ -1032,7 +1058,10 @@ tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
struct key_part *end = part + part_count;
for (; part < end; ++part, mp_next(&key)) {
const char *field;
- if (has_json_paths) {
+ if (is_multikey) {
+ field = tuple_field_raw_by_part_multikey(format,
+ tuple_raw, field_map, part, tuple_hint);
+ } else if (has_json_paths) {
field = tuple_field_raw_by_part(format, tuple_raw,
field_map, part);
} else {
@@ -1074,7 +1103,7 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
uint32_t part_count, struct key_def *key_def)
{
return tuple_compare_with_key_slowpath_hinted<is_nullable,
- has_optional_parts, has_json_paths>(tuple,
+ has_optional_parts, has_json_paths, false>(tuple,
HINT_INVALID, key, part_count,
HINT_INVALID, key_def);
}
@@ -1508,14 +1537,22 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
};
static const tuple_compare_hinted_t compare_slowpath_hinted_funcs[] = {
- tuple_compare_slowpath_hinted<false, false, false>,
- tuple_compare_slowpath_hinted<true, false, false>,
- tuple_compare_slowpath_hinted<false, true, false>,
- tuple_compare_slowpath_hinted<true, true, false>,
- tuple_compare_slowpath_hinted<false, false, true>,
- tuple_compare_slowpath_hinted<true, false, true>,
- tuple_compare_slowpath_hinted<false, true, true>,
- tuple_compare_slowpath_hinted<true, true, true>
+ tuple_compare_slowpath_hinted<false, false, false, false>,
+ tuple_compare_slowpath_hinted<true, false, false, false>,
+ tuple_compare_slowpath_hinted<false, true, false, false>,
+ tuple_compare_slowpath_hinted<true, true, false, false>,
+ tuple_compare_slowpath_hinted<false, false, true, false>,
+ tuple_compare_slowpath_hinted<true, false, true, false>,
+ tuple_compare_slowpath_hinted<false, true, true, false>,
+ tuple_compare_slowpath_hinted<true, true, true, false>,
+ tuple_compare_slowpath_hinted<false, false, false, true>,
+ tuple_compare_slowpath_hinted<true, false, false, true>,
+ tuple_compare_slowpath_hinted<false, true, false, true>,
+ tuple_compare_slowpath_hinted<true, true, false, true>,
+ tuple_compare_slowpath_hinted<false, false, true, true>,
+ tuple_compare_slowpath_hinted<true, false, true, true>,
+ tuple_compare_slowpath_hinted<false, true, true, true>,
+ tuple_compare_slowpath_hinted<true, true, true, true>,
};
void
@@ -1523,7 +1560,14 @@ key_def_set_tuple_compare(struct key_def *def)
{
int cmp_func_idx = (def->is_nullable ? 1 : 0) +
2 * (def->has_optional_parts ? 1 : 0) +
- 4 * (def->has_json_paths ? 1 : 0);
+ 4 * (def->has_json_paths ? 1 : 0) +
+ 8 * (def->has_multikey_parts ? 1 : 0);
+ if (def->has_multikey_parts) {
+ def->tuple_compare = NULL;
+ def->tuple_compare_hinted =
+ compare_slowpath_hinted_funcs[cmp_func_idx];
+ return;
+ }
if (def->is_nullable) {
if (key_def_is_sequential(def)) {
if (def->has_optional_parts) {
@@ -1795,14 +1839,22 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
static const tuple_compare_with_key_hinted_t
compare_with_key_slowpath_hinted_funcs[] = {
- tuple_compare_with_key_slowpath_hinted<false, false, false>,
- tuple_compare_with_key_slowpath_hinted<true, false, false>,
- tuple_compare_with_key_slowpath_hinted<false, true, false>,
- tuple_compare_with_key_slowpath_hinted<true, true, false>,
- tuple_compare_with_key_slowpath_hinted<false, false, true>,
- tuple_compare_with_key_slowpath_hinted<true, false, true>,
- tuple_compare_with_key_slowpath_hinted<false, true, true>,
- tuple_compare_with_key_slowpath_hinted<true, true, true>
+ tuple_compare_with_key_slowpath_hinted<false, false, false, false>,
+ tuple_compare_with_key_slowpath_hinted<true, false, false, false>,
+ tuple_compare_with_key_slowpath_hinted<false, true, false, false>,
+ tuple_compare_with_key_slowpath_hinted<true, true, false, false>,
+ tuple_compare_with_key_slowpath_hinted<false, false, true, false>,
+ tuple_compare_with_key_slowpath_hinted<true, false, true, false>,
+ tuple_compare_with_key_slowpath_hinted<false, true, true, false>,
+ tuple_compare_with_key_slowpath_hinted<true, true, true, false>,
+ tuple_compare_with_key_slowpath_hinted<false, false, false, true>,
+ tuple_compare_with_key_slowpath_hinted<true, false, false, true>,
+ tuple_compare_with_key_slowpath_hinted<false, true, false, true>,
+ tuple_compare_with_key_slowpath_hinted<true, true, false, true>,
+ tuple_compare_with_key_slowpath_hinted<false, false, true, true>,
+ tuple_compare_with_key_slowpath_hinted<true, false, true, true>,
+ tuple_compare_with_key_slowpath_hinted<false, true, true, true>,
+ tuple_compare_with_key_slowpath_hinted<true, true, true, true>
};
void
@@ -1810,7 +1862,14 @@ key_def_set_tuple_compare_with_key(struct key_def *def)
{
int cmp_func_idx = (def->is_nullable ? 1 : 0) +
2 * (def->has_optional_parts ? 1 : 0) +
- 4 * (def->has_json_paths ? 1 : 0);
+ 4 * (def->has_json_paths ? 1 : 0) +
+ 8 * (def->has_multikey_parts ? 1 : 0);
+ if (def->has_multikey_parts) {
+ def->tuple_compare_with_key = NULL;
+ def->tuple_compare_with_key_hinted =
+ compare_with_key_slowpath_hinted_funcs[cmp_func_idx];
+ return;
+ }
if (def->is_nullable) {
if (key_def_is_sequential(def)) {
if (def->has_optional_parts) {
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 4439ce3e0..99b602cd5 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -33,6 +33,7 @@
#include "json/json.h"
#include "tuple_format.h"
#include "coll_id_cache.h"
+#include "tuple.h"
#include "third_party/PMurHash.h"
@@ -233,14 +234,19 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno,
json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 &&
field->token.type != JSON_TOKEN_END) {
- if (field->token.type == JSON_TOKEN_ANY) {
- diag_set(ClientError, ER_UNSUPPORTED,
- "Tarantool", "multikey indexes");
- goto fail;
- }
enum field_type expected_type =
field->token.type == JSON_TOKEN_STR ?
FIELD_TYPE_MAP : FIELD_TYPE_ARRAY;
+ if (field->token.type == JSON_TOKEN_ANY &&
+ !json_token_is_multikey(&parent->token) &&
+ !json_token_is_leaf(&parent->token)) {
+ assert(expected_type == FIELD_TYPE_ARRAY);
+ diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
+ tuple_field_path(parent),
+ field_type_strs[parent->type],
+ "multikey array");
+ goto fail;
+ }
if (field_type1_contains_type2(parent->type, expected_type)) {
parent->type = expected_type;
} else {
@@ -475,9 +481,8 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
break;
format->min_tuple_size += mp_sizeof_nil();
}
- } else {
+ } else if (field->token.type == JSON_TOKEN_STR) {
/* Account a key string for map member. */
- assert(field->token.type == JSON_TOKEN_STR);
format->min_tuple_size +=
mp_sizeof_str(field->token.len);
}
@@ -805,6 +810,59 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
return true;
}
+/**
+ * Grow the tuple field map allocation to the left by
+ * extent_size zero-initialized bytes.
+ */
+static int
+tuple_field_map_realloc(uint32_t **field_map, uint32_t *field_map_size,
+ uint32_t extent_size, struct region *region)
+{
+ assert(extent_size % sizeof(uint32_t) == 0);
+ uint32_t new_field_map_size = *field_map_size + extent_size;
+ uint32_t *new_field_map = region_alloc(region, new_field_map_size);
+ if (new_field_map == NULL) {
+ diag_set(OutOfMemory, new_field_map_size, "region_alloc",
+ "new_field_map");
+ return -1;
+ }
+ memset(new_field_map, 0, extent_size);
+ if (*field_map != NULL) {
+ memcpy((char *)new_field_map + extent_size,
+ (char *)*field_map - *field_map_size, *field_map_size);
+ }
+ *field_map = (uint32_t *)((char *)new_field_map + new_field_map_size);
+ *field_map_size = new_field_map_size;
+ return 0;
+}
+
+/**
+ * Retrieve the tuple field map extent by root_offset_slot,
+ * reallocating memory if required.
+ */
+struct field_map_ext *
+tuple_field_map_ext_retrieve(uint32_t **field_map, uint32_t *field_map_size,
+ int32_t root_offset_slot, uint32_t extent_items,
+ struct region *region)
+{
+ uint32_t extent_sz = sizeof(struct field_map_ext) +
+ extent_items * sizeof(uint32_t);
+ uint32_t slot_off = (*field_map)[root_offset_slot];
+ if (slot_off == 0) {
+ /* Field map extent was not created yet. */
+ slot_off = *field_map_size / sizeof(uint32_t);
+ (*field_map)[root_offset_slot] = slot_off;
+ if (tuple_field_map_realloc(field_map, field_map_size,
+ extent_sz, region) != 0)
+ return NULL;
+ }
+ struct field_map_ext *field_map_ext =
+ tuple_field_map_ext(*field_map, root_offset_slot);
+ assert(field_map_ext->size == 0 || field_map_ext->size == extent_items);
+ field_map_ext->size = extent_items;
+ return field_map_ext;
+}
+
/** @sa declaration for details. */
int
tuple_field_map_create(struct tuple_format *format, const char *tuple,
@@ -817,14 +875,11 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
return 0; /* Nothing to initialize */
}
struct region *region = &fiber()->gc;
- *field_map_size = format->field_map_size;
- *field_map = region_alloc(region, *field_map_size);
- if (*field_map == NULL) {
- diag_set(OutOfMemory, *field_map_size, "region_alloc",
- "field_map");
+ *field_map = NULL;
+ *field_map_size = 0;
+ if (tuple_field_map_realloc(field_map, field_map_size,
+ format->field_map_size, region) != 0)
return -1;
- }
- *field_map = (uint32_t *)((char *)*field_map + *field_map_size);
const char *pos = tuple;
int rc = 0;
@@ -864,11 +919,6 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
uint32_t defined_field_count = MIN(field_count, validate ?
tuple_format_field_count(format) :
format->index_field_count);
- /*
- * Nullify field map to be able to detect by 0,
- * which key fields are absent in tuple_field().
- */
- memset((char *)*field_map - *field_map_size, 0, *field_map_size);
/*
* Prepare mp stack of the size equal to the maximum depth
* of the indexed field in the format::fields tree
@@ -885,6 +935,12 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
struct mp_stack stack;
mp_stack_create(&stack, format->fields_depth, frames);
mp_stack_push(&stack, MP_ARRAY, defined_field_count);
+ /**
+ * Pointer to the stack entry that represents the multikey
+ * index root ARRAY entry.
+ */
+ struct mp_frame *mk_parent_frame = NULL;
+
struct tuple_field *field;
struct json_token *parent = &format->fields.root;
while (true) {
@@ -901,6 +957,10 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
mp_stack_pop(&stack);
if (mp_stack_is_empty(&stack))
goto finish;
+ if (json_token_is_multikey(parent)) {
+ /* Leave multikey index branch. */
+ mk_parent_frame = NULL;
+ }
parent = parent->parent;
}
/*
@@ -950,8 +1010,23 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
field_type_strs[field->type]);
goto error;
}
- if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+ if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
+ mk_parent_frame == NULL) {
(*field_map)[field->offset_slot] = pos - tuple;
+ } else if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
+ mk_parent_frame != NULL) {
+ uint32_t extent_items = mk_parent_frame->count;
+ struct field_map_ext *field_map_ext =
+ tuple_field_map_ext_retrieve(field_map,
+ field_map_size,
+ field->offset_slot,
+ extent_items, region);
+ if (field_map_ext == NULL)
+ goto error;
+ int multikey_idx = mk_parent_frame->curr;
+ field_map_ext->field_map[
+ -multikey_idx] = pos - tuple;
+ }
if (required_fields != NULL)
bit_clear(required_fields, field->id);
}
@@ -968,6 +1043,11 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
mp_decode_array(&pos) :
mp_decode_map(&pos);
mp_stack_push(&stack, type, size);
+ if (json_token_is_multikey(&field->token)) {
+ /* Track multikey entry frame. */
+ assert(type == MP_ARRAY);
+ mk_parent_frame = &frames[stack.used - 1];
+ }
parent = &field->token;
} else {
mp_next(&pos);
diff --git a/test/engine/json.result b/test/engine/json.result
index a790cdfbc..5c7a9b3b3 100644
--- a/test/engine/json.result
+++ b/test/engine/json.result
@@ -691,7 +691,85 @@ s = box.schema.space.create('withdata')
...
idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}})
---
-- error: Tarantool does not support multikey indexes
+...
+s:insert({1, 1, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}})
+---
+- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({1, 1, {{fname='Ivan', sname='Ivanych'}}})
+---
+- [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+...
+idx:select({'James', 'Bond'})
+---
+- - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+idx:select({'Kirill', 'Shcherbatov'})
+---
+- []
+...
+idx:select({'Ivan', 'Ivanych'})
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+...
+idx:select({'Vasya', 'Pupkin'})
+---
+- - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+idx:select()
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+ - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+ - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+---
+- error: Duplicate key exists in unique index 'idx' in space 'withdata'
+...
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+---
+- error: Duplicate key exists in unique index 'idx' in space 'withdata'
+...
+idx:select()
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+ - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+ - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+idx:delete({'Vasya', 'Pupkin'})
+---
+- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+---
+- [2, 1, [{'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+---
+- [2, 1, [{'fname': 'James', 'sname': 'Bond'}]]
+...
+idx:select()
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+ - [2, 1, [{'fname': 'James', 'sname': 'Bond'}]]
+ - [2, 1, [{'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:drop()
+---
+...
+s = box.schema.space.create('withdata')
+---
+...
+pk = s:create_index('pk')
+---
+...
+idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}})
+---
+...
+s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}})
+---
+- [1, 1, [{'fname': 'A1', 'extra': {'sname': 'A2'}}, {'fname': 'B1'}, {'fname': 'C1',
+ 'extra': {'sname': 'C2'}}]]
...
s:drop()
---
diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua
index f9a7180b1..dc6016af9 100644
--- a/test/engine/json.test.lua
+++ b/test/engine/json.test.lua
@@ -198,4 +198,24 @@ s:drop()
--
s = box.schema.space.create('withdata')
idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}})
+s:insert({1, 1, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}})
+s:insert({1, 1, {{fname='Ivan', sname='Ivanych'}}})
+idx:select({'James', 'Bond'})
+idx:select({'Kirill', 'Shcherbatov'})
+idx:select({'Ivan', 'Ivanych'})
+idx:select({'Vasya', 'Pupkin'})
+idx:select()
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+idx:select()
+idx:delete({'Vasya', 'Pupkin'})
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+idx:select()
+s:drop()
+
+s = box.schema.space.create('withdata')
+pk = s:create_index('pk')
+idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}})
+s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}})
s:drop()
--
2.21.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
@ 2019-03-14 7:04 ` Vladimir Davydov
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-14 7:04 UTC (permalink / raw)
To: Kirill Shcherbatov; +Cc: tarantool-patches
On Wed, Mar 13, 2019 at 03:15:37PM +0300, Kirill Shcherbatov wrote:
> @@ -1326,6 +1340,6 @@ tuple_compare_with_key_create(const struct key_def *def)
> void
> key_def_set_compare_func(struct key_def *def)
> {
> - def->tuple_compare = tuple_compare_create(def);
> - def->tuple_compare_with_key = tuple_compare_with_key_create(def);
> + key_def_set_tuple_compare(def);
> + key_def_set_tuple_compare_with_key(def);
> }
The two functions share a lot of code - all branching is basically the
same. I think we better merge them instead.
* Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
@ 2019-03-14 8:19 ` Vladimir Davydov
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-14 8:19 UTC (permalink / raw)
To: Kirill Shcherbatov; +Cc: tarantool-patches
On Wed, Mar 13, 2019 at 03:15:38PM +0300, Kirill Shcherbatov wrote:
> Implement functions for retrieving tuple hints for a particular
> key_def. Hint is an integer that can be used for tuple comparison
> optimization: if a hint of one tuple is less than a hint of another
> then the first tuple is definitely less than the second; only if
> hints are equal tuple_compare must be called for getting comparison
> result.
>
> Hints are calculated using only the first part of key_def.
>
> Hints are only useful when:
> * they are precalculated and stored along with the tuple;
> calculation is not cheap (cheaper than tuple_compare btw) but
> precalculated hints allow to compare tuples without even fetching
> tuple data.
> * first part of key_def is 'string'(for now)
> * since hint is calculated using only the first part of key_def
> (and only first several characters if it is a string) this part
> must be different with high probability for every tuple pair.
Please refresh the comment. It isn't quite correct.
>
> Closes #3961
> ---
> src/box/key_def.h | 95 +++++++
> src/box/memtx_tree.c | 46 ++-
> src/box/tuple_compare.cc | 584 +++++++++++++++++++++++++++++++++++++--
> src/lib/coll/coll.c | 33 +++
> src/lib/coll/coll.h | 4 +
> 5 files changed, 730 insertions(+), 32 deletions(-)
Please add tests checking that changing an indexed field type from
scalar to a basic type and vice versa without rebuilding the index
doesn't break ordering.
>
> diff --git a/src/box/key_def.h b/src/box/key_def.h
> index dd62f6a35..0d8f3112e 100644
> --- a/src/box/key_def.h
> +++ b/src/box/key_def.h
> @@ -137,6 +137,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
> typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
> const struct tuple *tuple_b,
> struct key_def *key_def);
> +/** @copydoc tuple_compare_with_key_hinted() */
> +typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
> + uint64_t tuple_hint,
> + const char *key,
> + uint32_t part_count,
> + uint64_t key_hint,
> + struct key_def *key_def);
On second thought, we might want to reduce the hint size or increase
it in the future. Would it be better to introduce a helper type for hints so
that we could easily do this if we need to?
typedef uint64_t tuple_hint_t;
What do you think? Would it be worthwhile? Would it look OK?
Sorry, I asked you to use uint64_t before. Looks to me now that
I was wrong... :-(
> @@ -571,6 +597,51 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
> return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
> }
>
> +/**
> + * Compare tuples using the key definition and comparison hints.
> + * @param tuple_a First tuple.
> + * @param tuple_a_hint Comparison hint is calculated for the
> + * @a tuple_a.
Should be "Comparison hint of @a tuple_a"
Please fix here and everywhere else where appropriate.
> + * @param tuple_b Second tuple.
> + * @param tuple_b_hint Comparison hint is calculated for the
> + * @a tuple_b.
> + * @param key_def Key definition.
> + * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b)
> + * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b)
> + * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b)
> + */
> +static inline int
> +tuple_compare_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint,
> + const struct tuple *tuple_b, uint64_t tuple_b_hint,
> + struct key_def *key_def)
> +{
> + return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
> + tuple_b_hint, key_def);
> +}
> +
> +/**
> + * Compare tuple with key using the key definition and
> + * comparison hints.
> + * @param tuple tuple
> + * @param tuple_hint Comparison hint is calculated for @a tuple.
> + * @param key key parts without MessagePack array header
> + * @param part_count the number of parts in @a key
This is nitpicking, of course, but please be consistent. If you put dot
(.) at the end of a sentence, put it everywhere.
> + * @param key_hint t Comparison key kent is calculated for @a key.
Comparison hint of @a key.
> + * @param key_def key definition
> + * @retval 0 if key_fields(tuple) == parts(key)
> + * @retval <0 if key_fields(tuple) < parts(key)
> + * @retval >0 if key_fields(tuple) > parts(key)
> + */
> +static inline int
> +tuple_compare_with_key_hinted(const struct tuple *tuple, uint64_t tuple_hint,
> + const char *key, uint32_t part_count,
> + uint64_t key_hint, struct key_def *key_def)
> +{
> + return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
> + part_count, key_hint,
> + key_def);
> +}
> +
> /**
> * Compute hash of a tuple field.
> * @param ph1 - pointer to running hash
> @@ -624,6 +695,30 @@ key_hash(const char *key, struct key_def *key_def)
> return key_def->key_hash(key, key_def);
> }
>
> + /*
> + * Get a comparison hint for a @a tuple.
The article before 'tuple' ('a') is redundant.
> + * @param tuple - tuple to get uint64_t of.
tuple to compute the hint for.
> + * @param key_def - key_def that defines which comparison is used.
key definition used for tuple comparison.
> + * @return the comparison auxiliary information.
the hint.
> + */
> +static inline uint64_t
> +tuple_hint(const struct tuple *tuple, struct key_def *key_def)
> +{
> + return key_def->tuple_hint(tuple, key_def);
> +}
> +
> +/**
> + * Get a comparison hint of a @a key.
> + * @param key - key to get hint of.
> + * @param key_def - key_def that defines which comparison is used.
> + * @return the comparison auxiliary information.
Ditto.
> + */
> +static inline uint64_t
> +key_hint(const char *key, struct key_def *key_def)
> +{
> + return key_def->key_hint(key, key_def);
> +}
> +
> #if defined(__cplusplus)
> } /* extern "C" */
> #endif /* defined(__cplusplus) */
> diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
> index d5b42efb6..2c1933ceb 100644
> --- a/src/box/memtx_tree.c
> +++ b/src/box/memtx_tree.c
> @@ -47,6 +47,11 @@ struct memtx_tree_key_data {
> const char *key;
> /** Number of msgpacked search fields. */
> uint32_t part_count;
> + /**
> + * Comparison hint is calculated with
> + * key_hint(key, ...).
> + */
/** Comparison hint, see tuple_hint(). */
> + uint64_t hint;
> };
>
> /**
> @@ -55,6 +60,11 @@ struct memtx_tree_key_data {
> struct memtx_tree_data {
> /* Tuple that this node represents. */
> struct tuple *tuple;
> + /**
> + * Comparison hint is calculated with
> + * tuple_hint(tuple, ...).
> + */
/** Comparison hint, see key_hint(). */
> + uint64_t hint;
> };
>
> /**
> @@ -69,16 +79,18 @@ static bool
> memtx_tree_data_identical(const struct memtx_tree_data *a,
> const struct memtx_tree_data *b)
> {
> - return a->tuple == b->tuple;
> + return a->tuple == b->tuple && a->hint == b->hint;
> }
>
> #define BPS_TREE_NAME memtx_tree
> #define BPS_TREE_BLOCK_SIZE (512)
> #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
> #define BPS_TREE_COMPARE(a, b, arg)\
> - tuple_compare((&a)->tuple, (&b)->tuple, arg)
> + tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
> + (&b)->hint, arg)
Hmm, (&a)->tuple is equivalent to (a).tuple, isn't it?
> @@ -516,9 +533,11 @@ memtx_tree_index_get(struct index *base, const char *key,
> assert(base->def->opts.is_unique &&
> part_count == base->def->key_def->part_count);
> struct memtx_tree_index *index = (struct memtx_tree_index *)base;
> + struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
> struct memtx_tree_key_data key_data;
> key_data.key = key;
> key_data.part_count = part_count;
> + key_data.hint = key_hint(key, cmp_def);
> struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
> *result = res != NULL ? res->tuple : NULL;
> return 0;
> @@ -530,9 +549,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
> struct tuple **result)
> {
> struct memtx_tree_index *index = (struct memtx_tree_index *)base;
> + struct key_def *key_def = index->tree.arg;
Accessing index->tree.arg looks ugly - it violates encapsulation.
Besides, right above you use index_def_cmp_def instead. I think we
better introduce a helper function for accessing index->tree.arg
and use it everywhere:
memtx_tree_cmp_def(&index->tree)
> diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
> index 1bcff2ca3..86922c9bb 100644
> --- a/src/box/tuple_compare.cc
> +++ b/src/box/tuple_compare.cc
> @@ -34,6 +34,383 @@
> #include "trivia/util.h" /* NOINLINE */
> #include <math.h>
>
> +/**
> + * Tuple hints
If you agree to add tuple_hint_t, then please move this comment to
the type declaration. Don't forget to update it when you introduce
multikey indexes.
> + *
> + * Hint is a value h(t) is calculated for tuple in terms
> + * of particular key_def that has the following rules:
Tuple comparison hint h(t) is such a function of tuple t that
the following conditions always hold for any pair of tuples
t1 and t2:
> + * if h(t1) < h(t2) then t1 < t2;
> + * if h(t1) > h(t2) then t1 > t2;
> + * if t1 == t2 then regular comparision is required;
if h(t1) == h(t2) then t1 may or may not be equal to t2.
> + * These rules mean that instead of direct tuple vs tuple
> + * (or tuple vs key) comparison one may compare theirs
> + * hints first; and only if theirs hints equal compare
> + * the tuples themselves.
> + *
> + * The comparison hint has the following structure:
> + * uint64_t: [ CMP_HINT_TYPE | CMP_HINT_VALUE ]
> + * 64 62 0 bit
> + *
> + * The reserved value HINT_INVALID is returned when hint is
> + * undefined.
> + */
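The compare-via-hint pattern this comment describes can be sketched as a minimal standalone example (all names here are invented for illustration, not the patch's actual API):

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical hint type and "undefined" value, mirroring the comment. */
typedef uint64_t example_hint_t;
static const example_hint_t EXAMPLE_HINT_INVALID = UINT64_MAX;

/* Full comparison fallback (a stub standing in for tuple_compare). */
static int
example_full_compare(int64_t a, int64_t b)
{
	return a < b ? -1 : a > b ? 1 : 0;
}

/*
 * Consult the hints first; fall back to the full comparison only
 * when a hint is undefined or the hints are equal and hence
 * cannot decide the order.
 */
static int
example_compare_hinted(int64_t a, example_hint_t ha,
		       int64_t b, example_hint_t hb)
{
	if (ha != EXAMPLE_HINT_INVALID && hb != EXAMPLE_HINT_INVALID &&
	    ha != hb)
		return ha < hb ? -1 : 1;
	return example_full_compare(a, b);
}
```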
> +#define HINT_MAX (((int64_t)1 << 61) - 1)
> +#define HINT_MIN (-HINT_MAX - 1)
> +#define HINT_UMAX (((uint64_t)1 << 62) - 1)
What's HINT_UMAX? Please remove if possible.
> +
> +#define HINT_INVALID UINT64_MAX
HINT_NONE would be a better name IMO, because it would stress the
fact that the hint is absent.
> +
> +#define HINT_TYPE_NUMBER ((uint8_t)1)
> +#define HINT_TYPE_STRING ((uint8_t)2)
> +#define HINT_TYPE_BOOL ((uint8_t)3)
> +
> +#define HINT_TYPE(hint) ((uint64_t)hint >> 62)
> +#define HINT_VALUE(hint) ((uint64_t)hint & (((uint64_t)1 << 62) - 1))
Using 62 directly is error-prone (in case we decide to change it).
Please define constants for all numbers you use.
> +#define HINT(type, value) ({ \
> + assert(HINT_TYPE(value) == 0); \
> + (((uint64_t)type << 62) | (uint64_t)value);})
Would be better to make these helpers static inline functions.
> +
> +/**
> + * Perform hints comparison.
> + * Returns 1 when hint_a > hint_b, -1 when hint_b > hint_a;
> + * If the hints are incompatible or equal, a value of 0 is
> + * returned, and the further comparison of the original data is
> + * required.
> + */
> +static int
> +hint_cmp(uint64_t hint_a, uint64_t hint_b)
> +{
> + if (hint_a != HINT_INVALID && hint_b != HINT_INVALID &&
> + HINT_TYPE(hint_a) == HINT_TYPE(hint_b) &&
> + HINT_VALUE(hint_a) != HINT_VALUE(hint_b))
> + return hint_a < hint_b ? -1 : 1;
Hints must be defined in such a way that we don't need to compare types,
i.e. hint(123) < hint("abc").
I suggest you take a look at mp_class. Maybe we could reuse it for
hint types?
> + else
> + return 0;
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_uint(const char *key, struct key_def *key_def)
Please use unsigned/integer for these functions so as not to confuse
them with MP_INT/UINT.
> +{
> + (void)key_def;
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
> + uint64_t ret;
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_NUMBER, 0);
> + assert(mp_typeof(*key) == MP_UINT);
> + uint64_t val = mp_decode_uint(&key);
> + ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
> + return HINT(HINT_TYPE_NUMBER, ret);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_uint(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_NUMBER, 0);
> + return key_hint_uint<false, is_nullable>(field, key_def);
> +}
Although I didn't complain about the way you define hint functions
before, I've always had an inkling that something is wrong, that calling
key_hint from tuple_hint looks weird. The 'is_optional' template
argument you added recently has only strengthened my suspicions: it
looks confusing, because in fact it doesn't have anything to do with
optional parts we use in tuple_compare templates.
I propose the following refactoring. Let's introduce field_hint_*
functions (not using templates) that would take only binary data
of a particular type and return a hint, e.g.
tuple_hint_t
field_hint_unsigned(const char *field) {
assert(mp_typeof(*field) == MP_UINT);
...
}
Then add two parametrised functions for computing a hint for a key and
a tuple that would look like this:
template<enum field_type type, ...>
tuple_hint(const struct tuple *tuple, struct key_def *key_def) {
switch (type) {
case FIELD_TYPE_UNSIGNED:
return field_hint_unsigned(...);
case FIELD_TYPE_INTEGER:
return field_hint_integer(...);
...
}
}
template<enum field_type type, ...>
key_hint(...) {
switch (type) {
case FIELD_TYPE_UNSIGNED:
return field_hint_unsigned(...);
case FIELD_TYPE_INTEGER:
return field_hint_integer(...);
...
}
}
This way all the logic computing hints would be consolidated in
field_hint_* family of functions which would be simple and wouldn't
even take key_def. This would also allow you to get rid of that ugly
switch-case in key_def_set_hint_func.
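A minimal sketch of the proposed layout, with invented names, showing that the switch on a template parameter folds away at compile time:

```cpp
#include <cassert>
#include <cstdint>

/* Invented field types standing in for enum field_type. */
enum example_field_type { EXAMPLE_UNSIGNED, EXAMPLE_INTEGER };

/* Per-type hint helpers that take only the field value, no key_def. */
static uint64_t
example_field_hint_unsigned(uint64_t field)
{
	return field;
}

static uint64_t
example_field_hint_integer(int64_t field)
{
	/* Shift into an order-preserving unsigned range. */
	return (uint64_t)field + ((uint64_t)INT64_MAX + 1);
}

/*
 * Dispatch is resolved at compile time: 'type' is a template
 * parameter, so each instantiation keeps exactly one branch.
 */
template<enum example_field_type type>
static uint64_t
example_tuple_hint(int64_t raw_field)
{
	switch (type) {
	case EXAMPLE_UNSIGNED:
		return example_field_hint_unsigned((uint64_t)raw_field);
	case EXAMPLE_INTEGER:
		return example_field_hint_integer(raw_field);
	}
	return 0;
}
```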
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_int(const char *key, struct key_def *key_def)
> +{
> + (void)key_def;
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_INTEGER);
> + uint64_t ret;
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_NUMBER, 0);
> + if (mp_typeof(*key) == MP_UINT) {
> + uint64_t val = mp_decode_uint(&key);
> + ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
> + } else {
> + assert(mp_typeof(*key) == MP_INT);
> + int64_t val = mp_decode_int(&key);
> + ret = val > HINT_MAX ? HINT_UMAX :
> + val < HINT_MIN ? 0 :
> + (uint64_t)val - (uint64_t)HINT_MIN;
> + }
> + return HINT(HINT_TYPE_NUMBER, ret);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_int(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_INTEGER);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_NUMBER, 0);
> + return key_hint_int<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_number(const char *key, struct key_def *key_def)
> +{
> + (void)key_def;
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_NUMBER ||
> + key_def->parts->type == FIELD_TYPE_SCALAR);
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_NUMBER, 0);
> + uint64_t ret;
> + switch (mp_typeof(*key)) {
> + case MP_FLOAT:
> + case MP_DOUBLE: {
> + double f = mp_typeof(*key) == MP_FLOAT ?
> + mp_decode_float(&key) : mp_decode_double(&key);
> + if (!isfinite(f))
> + return HINT_INVALID;
> + double val;
> + (void)modf(f, &val);
> + ret = val > (double)HINT_MAX ? HINT_UMAX :
> + val < (double)HINT_MIN ? 0 :
As I told you before, comparing double with int64_t like this is
incorrect:
#include <stdio.h>
#include <stdint.h>
int main()
{
int64_t x = (1LL << 61) - 1;
double y = x;
printf("x is %lld (int64_t)\n", x);
printf("y is %lf (double)\n", y);
printf("y <= (double)x is %s\n",
y <= (double)x ? "true" : "false");
printf("(int64_t)y is %lld\n", (int64_t)y);
}
prints
x is 2305843009213693951 (int64_t)
y is 2305843009213693952.000000 (double)
y <= (double)x is true
(int64_t)y is 2305843009213693952
Please fix and don't forget to add a test case.
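One possible way to avoid the trap, sketched with invented names: compare against exp2(61) directly, which is exactly representable as a double, instead of against the rounded (double)HINT_MAX:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

/* 62-bit payload as in this version: values are shifted by -HINT_MIN. */
static const int64_t EXAMPLE_HINT_MAX = (((int64_t)1 << 61) - 1);
static const int64_t EXAMPLE_HINT_MIN = -EXAMPLE_HINT_MAX - 1;
static const uint64_t EXAMPLE_HINT_UMAX = (((uint64_t)1 << 62) - 1);

/*
 * Clamp a double to the hint payload range without the rounding
 * trap: (double)EXAMPLE_HINT_MAX rounds up to 2^61, so a value of
 * exactly 2^61 would slip past the `val > (double)HINT_MAX` check
 * and overflow the cast. Comparing against exp2(61) itself, which
 * is exactly representable, keeps the boundary tight.
 */
static uint64_t
example_double_hint_value(double val)
{
	if (val >= exp2(61))
		return EXAMPLE_HINT_UMAX;
	if (val < -exp2(61))
		return 0;
	return (uint64_t)(int64_t)val - (uint64_t)EXAMPLE_HINT_MIN;
}
```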
> + (uint64_t)val - (uint64_t)HINT_MIN;
> + break;
> + }
> + case MP_INT: {
> + int64_t val = (uint64_t)mp_decode_int(&key);
> + ret = val > HINT_MAX ? HINT_UMAX :
> + val < HINT_MIN ? 0 :
> + (uint64_t)val - (uint64_t)HINT_MIN;
> + break;
> + }
> + case MP_UINT: {
> + uint64_t val = mp_decode_uint(&key);
> + ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
> + break;
> + }
> + default:
> + unreachable();
> + }
> + return HINT(HINT_TYPE_NUMBER, ret);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_number(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_NUMBER);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_NUMBER, 0);
> + return key_hint_number<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_boolean(const char *key, struct key_def *key_def)
> +{
> + (void)key_def;
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_BOOLEAN ||
> + key_def->parts->type == FIELD_TYPE_SCALAR);
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_BOOL, 0);
> + return (uint64_t)mp_decode_bool(&key);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_boolean(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_BOOLEAN);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_BOOL, 0);
> + return key_hint_boolean<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_string(const char *key, struct key_def *key_def)
> +{
> + (void)key_def;
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->coll == NULL);
> + assert(key_def->parts->type == FIELD_TYPE_STRING ||
> + key_def->parts->type == FIELD_TYPE_SCALAR);
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_STRING, 0);
> + assert(mp_typeof(*key) == MP_STR);
> + uint32_t len;
> + const unsigned char *str =
> + (const unsigned char *)mp_decode_str(&key, &len);
> + uint64_t result = 0;
> + uint32_t process_len = MIN(len, 7);
> + for (uint32_t i = 0; i < process_len; i++) {
> + result <<= 8;
> + result |= str[i];
> + }
> + result <<= 8 * (7 - process_len);
Please define an appropriate constant for 7.
> + return HINT(HINT_TYPE_STRING, result);
> +}
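The prefix-packing scheme above can be sketched with the magic number named (a standalone illustration, not the patch's code):

```cpp
#include <cassert>
#include <cstdint>

/*
 * Number of leading string bytes that fit into the hint payload
 * (the constant the review asks to name; value taken from the patch).
 */
enum { EXAMPLE_HINT_STR_BYTES = 7 };

/*
 * Pack the first bytes of a string into a big-endian integer so
 * that unsigned comparison of hints matches memcmp() order of the
 * original strings.
 */
static uint64_t
example_string_hint_value(const unsigned char *str, uint32_t len)
{
	uint64_t result = 0;
	uint32_t n = len < EXAMPLE_HINT_STR_BYTES ? len :
		     EXAMPLE_HINT_STR_BYTES;
	for (uint32_t i = 0; i < n; i++)
		result = (result << 8) | str[i];
	/* Left-align short strings so that "ab" sorts before "b". */
	result <<= 8 * (EXAMPLE_HINT_STR_BYTES - n);
	return result;
}
```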
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_string(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->coll == NULL);
> + assert(key_def->parts->type == FIELD_TYPE_STRING);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_STRING, 0);
> + return key_hint_string<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_string_coll(const char *key, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert((key_def->parts->type == FIELD_TYPE_STRING ||
> + key_def->parts->type == FIELD_TYPE_SCALAR) &&
> + key_def->parts->coll != NULL);
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_STRING, 0);
> + assert(mp_typeof(*key) == MP_STR);
> + uint32_t len;
> + const char *str = mp_decode_str(&key, &len);
> + return key_def->parts->coll->hint(str, len, key_def->parts->coll);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_string_coll(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_STRING &&
> + key_def->parts->coll != NULL);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_STRING, 0);
> + return key_hint_string_coll<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_scalar(const char *key, struct key_def *key_def)
> +{
> + (void)key_def;
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_SCALAR);
> + if ((is_optional && key == NULL) ||
> + (is_nullable && mp_typeof(*key) == MP_NIL))
> + return HINT(HINT_TYPE_BOOL, 0);
> + switch (mp_typeof(*key)) {
> + case MP_INT:
> + case MP_UINT:
> + case MP_FLOAT:
> + case MP_DOUBLE:
> + return key_hint_number<is_optional, is_nullable>(key, key_def);
> + case MP_BOOL:
> + return key_hint_boolean<is_optional, is_nullable>(key, key_def);
> + case MP_STR:
> + if (key_def->parts->coll == NULL)
> + return key_hint_string<is_optional,
> + is_nullable>(key, key_def);
> + else
> + return key_hint_string_coll<is_optional,
> + is_nullable>(key, key_def);
Please make sure this if() is resolved at compile time. Since we've
chosen the path of using templates for optimization, we should stick
to it till the end.
> + default:
> + return HINT_INVALID;
Should be unreachable(). BTW you forgot MP_BIN.
> + }
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_scalar(const struct tuple *tuple, struct key_def *key_def)
> +{
> + assert(key_part_is_nullable(key_def->parts) == is_nullable);
> + assert(key_def->parts->type == FIELD_TYPE_SCALAR);
> + const char *field = tuple_field_by_part(tuple, key_def->parts);
> + if (is_nullable && field == NULL)
> + return HINT(HINT_TYPE_BOOL, 0);
> + return key_hint_scalar<false, is_nullable>(field, key_def);
> +}
> +
> +void
> +key_def_set_hint_func(struct key_def *def)
> +{
> + def->key_hint = NULL;
> + def->tuple_hint = NULL;
> + bool is_nullable = key_part_is_nullable(def->parts);
> + switch (def->parts->type) {
> + case FIELD_TYPE_UNSIGNED: {
> + def->key_hint = is_nullable ? key_hint_uint<true, true> :
> + key_hint_uint<true, false>;
> + def->tuple_hint = is_nullable ? tuple_hint_uint<true> :
> + tuple_hint_uint<false>;
> + break;
> + }
> + case FIELD_TYPE_INTEGER: {
> + def->key_hint = is_nullable ? key_hint_int<true, true> :
> + key_hint_int<true, false>;
> + def->tuple_hint = is_nullable ? tuple_hint_int<true> :
> + tuple_hint_int<false>;
> + break;
> + }
> + case FIELD_TYPE_STRING: {
> + if (def->parts->coll != NULL) {
> + def->key_hint =
> + is_nullable ? key_hint_string_coll<true, true> :
> + key_hint_string_coll<true, false>;
> + def->tuple_hint =
> + is_nullable ? tuple_hint_string_coll<true> :
> + tuple_hint_string_coll<false>;
> + } else {
> + def->key_hint =
> + is_nullable ? key_hint_string<true, true> :
> + key_hint_string<true, false>;
> + def->tuple_hint = is_nullable ? tuple_hint_string<true> :
> + tuple_hint_string<false>;
> + }
> + break;
> + }
> + case FIELD_TYPE_NUMBER: {
> + def->key_hint = is_nullable ? key_hint_number<true, true> :
> + key_hint_number<true, false>;
> + def->tuple_hint = is_nullable ? tuple_hint_number<true> :
> + tuple_hint_number<false>;
> + break;
> + }
> + case FIELD_TYPE_BOOLEAN: {
> + def->key_hint = is_nullable ? key_hint_boolean<true, true> :
> + key_hint_boolean<true, false>;
> + def->tuple_hint = is_nullable ? tuple_hint_boolean<true> :
> + tuple_hint_boolean<false>;
> + break;
> + }
> + case FIELD_TYPE_SCALAR: {
> + def->key_hint = is_nullable ? key_hint_scalar<true, true> :
> + key_hint_scalar<true, false>;
> + def->tuple_hint = is_nullable ? tuple_hint_scalar<true> :
> + tuple_hint_scalar<false>;
> + break;
> + }
> + default: {
> + break;
> + }
> + }
> +}
> +
> /* {{{ tuple_compare */
>
> /*
> @@ -792,9 +1210,13 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
>
> template <bool is_nullable, bool has_optional_parts>
> static int
> -tuple_compare_sequential(const struct tuple *tuple_a,
> - const struct tuple *tuple_b, key_def *key_def)
> +tuple_compare_sequential_hinted(const struct tuple *tuple_a,
> + uint64_t tuple_a_hint, const struct tuple *tuple_b,
> + uint64_t tuple_b_hint, key_def *key_def)
> {
> + int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
> + if (rc != 0)
> + return rc;
This should go after all the assertions below.
> assert(!has_optional_parts || is_nullable);
> assert(has_optional_parts == key_def->has_optional_parts);
> assert(key_def_is_sequential(key_def));
> diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
> index afc15e809..3bd7c44f5 100644
> --- a/src/lib/coll/coll.c
> +++ b/src/lib/coll/coll.c
> @@ -133,6 +133,37 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
> return s_len;
> }
>
> +/** Get a compare hint of a binary collation. */
> +static uint64_t
> +coll_bin_hint(const char *s, size_t s_len, struct coll *coll)
> +{
> + (void)coll;
> + uint64_t result = 0;
> + uint32_t process_len = MIN(s_len, 7);
> + for (uint32_t i = 0; i < process_len; i++) {
> + result <<= 8;
> + result |= ((unsigned char *)s)[i];
> + }
> + result <<= 8 * (7 - process_len);
I don't like that you copy-n-paste string hint computation logic here
and in tuple_compare.cc. I think that it should be consolidated in
tuple_compare.cc while coll->hint should return a hint in some raw
format. I guess it could be just a wrapper around ucol_nextSortKeyPart.
Maybe we should even pick another name for it.
> + return result;
> +}
> +
> +/** Get a compare hint of a string using ICU collation. */
> +static uint64_t
> +coll_icu_hint(const char *s, size_t s_len, struct coll *coll)
> +{
> + assert(coll->type == COLL_TYPE_ICU);
> + UCharIterator itr;
> + uiter_setUTF8(&itr, s, s_len);
> + uint8_t buf[7];
> + uint32_t state[2] = {0, 0};
> + UErrorCode status = U_ZERO_ERROR;
> + int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, buf,
> + sizeof(buf), &status);
> + assert(got >= 0 && got <= 7);
> + return coll_bin_hint((const char *)buf, got, NULL);
> +}
* Re: [tarantool-patches] Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
2019-03-14 8:19 ` Vladimir Davydov
@ 2019-03-15 10:20 ` Kirill Shcherbatov
2019-03-20 18:08 ` Vladimir Davydov
0 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:20 UTC (permalink / raw)
To: tarantool-patches, Vladimir Davydov
> Please refresh the comment. It isn't quite correct.
I've stripped it a bit.
> On the second thought, we might want to reduce the hint size or increase
> it in future. Would it be better to introduce a helper type for hints so
> that we could easily do this if we need to?
>
> typedef uint64_t tuple_hint_t;
>
> What do you think? Would it be worthwhile? Would it look OK?
>
> Sorry, I asked you to use uint64_t before. Looks to me now that
> I was wrong... :-(
The tuple_hint_t name is not really good, I believe.
First of all, tuple_hint() is a routine that calculates such a hint for a tuple.
I prefer the hint_t type, which would remain ok even later with multikey index introduction.
> Hmm, (&a)->tuple is equivalent to (a).tuple, isn't it?
I don't understand why, but it doesn't work that way.
> Accessing index->tree.arg looks ugly - it violates encapsulation.
> Besides, right above you use index_def_cmp_def instead. I think we
> better introduce a helper function for accessing index->tree.arg
> and use it everywhere:
>
> memtx_tree_cmp_def(&index->tree)
Now I use the memtx_tree_index_cmp_def() routine, which is already
present in the code and works fine in this scenario.
==============================================================
Implement functions for retrieving tuple hints for a particular
key_def. A hint is an integer that can be used for tuple comparison
optimization: if the hint of one tuple is less than the hint of
another, then the first tuple is definitely less than the second;
only if the hints are equal must tuple_compare be called to get
the comparison result. Hints are calculated using only the first
part of key_def.
Close #3961
---
src/box/key_def.h | 128 +++++++++++++
src/box/memtx_tree.c | 40 +++-
src/box/tuple_compare.cc | 394 +++++++++++++++++++++++++++++++++++++--
src/lib/coll/coll.c | 29 +++
src/lib/coll/coll.h | 5 +
test/engine/ddl.result | 180 ++++++++++++++++++
test/engine/ddl.test.lua | 49 +++++
7 files changed, 800 insertions(+), 25 deletions(-)
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 7c9f8c6b5..174790f72 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -116,6 +116,41 @@ struct key_part {
struct key_def;
struct tuple;
+/**
+ * Tuple comparison hint h(t) is such a function of tuple t that
+ * the following conditions always hold for any pair of tuples
+ * t1 and t2:
+ * if h(t1) < h(t2) then t1 < t2;
+ * if h(t1) > h(t2) then t1 > t2;
+ * if h(t1) == h(t2) then t1 may or may not be equal to t2.
+ * These rules mean that instead of direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their
+ * hints first; and only if the hints are equal compare
+ * the tuples themselves.
+ *
+ * The comparison hint has the following structure:
+ * uint64_t: [ HINT_TYPE | HINT_VALUE ]
+ * 64 60 0 bit
+ *
+ * To make the values of hints for numeric types (unsigned,
+ * integer, float, double) compatible, define a linear mapping
+ * of the segment f(x) = k*x + b = x + max + 1:
+ * f([-max - 1, max]) = [0, 2 * max + 1].
+ * It is easy to verify that a < b <=> f(a) < f(b).
+ * The max value is determined by the number of bits allocated
+ * for storage of HINT_VALUE.
+ */
+typedef uint64_t hint_t;
+
+#define HINT_VALUE_BITS 60
+#define HINT_MAX (((int64_t)1 << (HINT_VALUE_BITS - 1)) - 1)
+#define HINT_MIN (-HINT_MAX - 1)
+
+/**
+ * The reserved value to use when comparison hint is undefined.
+ */
+#define HINT_NONE ((hint_t)UINT64_MAX)
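The mapping described in this comment can be checked with a small standalone sketch (names invented; the 60-bit payload width is taken from the comment above):

```cpp
#include <cassert>
#include <cstdint>

enum { EXAMPLE_HINT_VALUE_BITS = 60 };
static const int64_t EX_HINT_MAX =
	(((int64_t)1 << (EXAMPLE_HINT_VALUE_BITS - 1)) - 1);
static const int64_t EX_HINT_MIN = -EX_HINT_MAX - 1;

/*
 * f(x) = x + max + 1 maps [-max - 1, max] onto [0, 2 * max + 1]
 * and preserves order: a < b <=> f(a) < f(b). In unsigned
 * arithmetic subtracting HINT_MIN performs exactly this shift.
 */
static uint64_t
example_map(int64_t x)
{
	return (uint64_t)x - (uint64_t)EX_HINT_MIN;
}
```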
+
/**
* Get is_nullable property of key_part.
* @param key_part for which attribute is being fetched
@@ -137,6 +172,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
const struct tuple *tuple_b,
struct key_def *key_def);
+/** @copydoc tuple_compare_with_key_hinted() */
+typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
+ hint_t tuple_hint,
+ const char *key,
+ uint32_t part_count,
+ hint_t key_hint,
+ struct key_def *key_def);
+/** @copydoc tuple_compare_hinted() */
+typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a,
+ hint_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ hint_t tuple_b_hint,
+ struct key_def *key_def);
/** @copydoc tuple_extract_key() */
typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
struct key_def *key_def,
@@ -152,6 +200,11 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
/** @copydoc key_hash() */
typedef uint32_t (*key_hash_t)(const char *key,
struct key_def *key_def);
+/** @copydoc tuple_hint() */
+typedef hint_t (*tuple_hint_t)(const struct tuple *tuple,
+ struct key_def *key_def);
+/** @copydoc key_hint() */
+typedef hint_t (*key_hint_t)(const char *key, struct key_def *key_def);
/* Definition of a multipart key. */
struct key_def {
@@ -159,6 +212,10 @@ struct key_def {
tuple_compare_t tuple_compare;
/** @see tuple_compare_with_key() */
tuple_compare_with_key_t tuple_compare_with_key;
+ /** @see tuple_compare_with_key_hinted() */
+ tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
+ /** @see tuple_compare_hinted() */
+ tuple_compare_hinted_t tuple_compare_hinted;
/** @see tuple_extract_key() */
tuple_extract_key_t tuple_extract_key;
/** @see tuple_extract_key_raw() */
@@ -167,6 +224,10 @@ struct key_def {
tuple_hash_t tuple_hash;
/** @see key_hash() */
key_hash_t key_hash;
+ /** @see tuple_hint() */
+ tuple_hint_t tuple_hint;
+ /** @see key_hint() */
+ key_hint_t key_hint;
/**
* Minimal part count which always is unique. For example,
* if a secondary index is unique, then
@@ -575,6 +636,49 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
}
+/**
+ * Compare tuples using the key definition and comparison hints.
+ * @param tuple_a First tuple.
+ * @param tuple_a_hint Comparison hint of @a tuple_a.
+ * @param tuple_b Second tuple.
+ * @param tuple_b_hint Comparison hint of @a tuple_b.
+ * @param key_def Key definition.
+ * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b).
+ * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b).
+ * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b).
+ */
+static inline int
+tuple_compare_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+ const struct tuple *tuple_b, hint_t tuple_b_hint,
+ struct key_def *key_def)
+{
+ return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
+ tuple_b_hint, key_def);
+}
+
+/**
+ * Compare tuple with key using the key definition and
+ * comparison hints.
+ * @param tuple Tuple.
+ * @param tuple_hint Comparison hint of @a tuple.
+ * @param key Key parts without MessagePack array header.
+ * @param part_count The number of parts in @a key.
+ * @param key_hint Comparison hint of @a key.
+ * @param key_def Key definition.
+ * @retval 0 if key_fields(tuple) == parts(key).
+ * @retval <0 if key_fields(tuple) < parts(key).
+ * @retval >0 if key_fields(tuple) > parts(key).
+ */
+static inline int
+tuple_compare_with_key_hinted(const struct tuple *tuple, hint_t tuple_hint,
+ const char *key, uint32_t part_count,
+ hint_t key_hint, struct key_def *key_def)
+{
+ return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
+ part_count, key_hint,
+ key_def);
+}
+
/**
* Compute hash of a tuple field.
* @param ph1 - pointer to running hash
@@ -628,6 +732,30 @@ key_hash(const char *key, struct key_def *key_def)
return key_def->key_hash(key, key_def);
}
+/**
+ * Get a comparison hint of @a tuple.
+ * @param tuple Tuple to compute the hint for.
+ * @param key_def Key definition used for tuple comparison.
+ * @return The hint.
+ */
+static inline hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+ return key_def->tuple_hint(tuple, key_def);
+}
+
+/**
+ * Get a comparison hint of @a key.
+ * @param key Key to compute the hint for.
+ * @param key_def Key definition used for tuple comparison.
+ * @return The hint.
+ */
+static inline hint_t
+key_hint(const char *key, struct key_def *key_def)
+{
+ return key_def->key_hint(key, key_def);
+}
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index d5b42efb6..c2c8a19e7 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -47,6 +47,8 @@ struct memtx_tree_key_data {
const char *key;
/** Number of msgpacked search fields. */
uint32_t part_count;
+ /** Comparison hint, see tuple_hint(). */
+ hint_t hint;
};
/**
@@ -55,6 +57,8 @@ struct memtx_tree_key_data {
struct memtx_tree_data {
/* Tuple that this node is represents. */
struct tuple *tuple;
+ /** Comparison hint, see key_hint(). */
+ hint_t hint;
};
/**
@@ -69,16 +73,18 @@ static bool
memtx_tree_data_identical(const struct memtx_tree_data *a,
const struct memtx_tree_data *b)
{
- return a->tuple == b->tuple;
+ return a->tuple == b->tuple && a->hint == b->hint;
}
#define BPS_TREE_NAME memtx_tree
#define BPS_TREE_BLOCK_SIZE (512)
#define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
#define BPS_TREE_COMPARE(a, b, arg)\
- tuple_compare((&a)->tuple, (&b)->tuple, arg)
+ tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
+ (&b)->hint, arg)
#define BPS_TREE_COMPARE_KEY(a, b, arg)\
- tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg)
+ tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\
+ (b)->part_count, (b)->hint, arg)
#define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b)
#define bps_tree_elem_t struct memtx_tree_data
#define bps_tree_key_t struct memtx_tree_key_data *
@@ -113,7 +119,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c)
const struct memtx_tree_data *data_a = a;
const struct memtx_tree_data *data_b = b;
struct key_def *key_def = c;
- return tuple_compare(data_a->tuple, data_b->tuple, key_def);
+ return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple,
+ data_b->hint, key_def);
}
/* {{{ MemtxTree Iterators ****************************************/
@@ -235,9 +242,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret)
memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
/* Use user key def to save a few loops. */
if (res == NULL ||
- tuple_compare_with_key(res->tuple, it->key_data.key,
- it->key_data.part_count,
- it->index_def->key_def) != 0) {
+ tuple_compare_with_key_hinted(res->tuple, res->hint,
+ it->key_data.key,
+ it->key_data.part_count,
+ it->key_data.hint,
+ it->index_def->key_def) != 0) {
iterator->next = tree_iterator_dummie;
it->current.tuple = NULL;
*ret = NULL;
@@ -266,9 +275,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
/* Use user key def to save a few loops. */
if (res == NULL ||
- tuple_compare_with_key(res->tuple, it->key_data.key,
- it->key_data.part_count,
- it->index_def->key_def) != 0) {
+ tuple_compare_with_key_hinted(res->tuple, res->hint,
+ it->key_data.key,
+ it->key_data.part_count,
+ it->key_data.hint,
+ it->index_def->key_def) != 0) {
iterator->next = tree_iterator_dummie;
it->current.tuple = NULL;
*ret = NULL;
@@ -516,9 +527,11 @@ memtx_tree_index_get(struct index *base, const char *key,
assert(base->def->opts.is_unique &&
part_count == base->def->key_def->part_count);
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
struct memtx_tree_key_data key_data;
key_data.key = key;
key_data.part_count = part_count;
+ key_data.hint = key_hint(key, cmp_def);
struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
*result = res != NULL ? res->tuple : NULL;
return 0;
@@ -530,9 +543,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
struct tuple **result)
{
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *key_def = memtx_tree_index_cmp_def(index);
if (new_tuple) {
struct memtx_tree_data new_data;
new_data.tuple = new_tuple;
+ new_data.hint = tuple_hint(new_tuple, key_def);
struct memtx_tree_data dup_data;
dup_data.tuple = NULL;
@@ -565,6 +580,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
if (old_tuple) {
struct memtx_tree_data old_data;
old_data.tuple = old_tuple;
+ old_data.hint = tuple_hint(old_tuple, key_def);
memtx_tree_delete(&index->tree, old_data);
}
*result = old_tuple;
@@ -607,6 +623,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
it->type = type;
it->key_data.key = key;
it->key_data.part_count = part_count;
+ struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
+ it->key_data.hint = key_hint(key, cmp_def);
it->index_def = base->def;
it->tree = &index->tree;
it->tree_iterator = memtx_tree_invalid_iterator();
@@ -671,6 +689,8 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
struct memtx_tree_data *elem =
&index->build_array[index->build_array_size++];
elem->tuple = tuple;
+ struct key_def *key_def = memtx_tree_index_cmp_def(index);
+ elem->hint = tuple_hint(tuple, key_def);
return 0;
}
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 7fe1766a8..54a977926 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -34,6 +34,14 @@
#include "trivia/util.h" /* NOINLINE */
#include <math.h>
+/**
+ * Compare two hints according to the hint_t definition.
+ * Returns 1 when hint_a > hint_b and -1 when hint_a < hint_b.
+ * When the hints are equal or incomparable (HINT_NONE), returns 0.
+ */
+static int
+hint_cmp(hint_t hint_a, hint_t hint_b);
+
/* {{{ tuple_compare */
/*
@@ -427,13 +435,17 @@ tuple_compare_field_with_hint(const char *field_a, enum mp_type a_type,
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static inline int
-tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
- struct key_def *key_def)
+tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
+ hint_t tuple_a_hint, const struct tuple *tuple_b,
+ hint_t tuple_b_hint, struct key_def *key_def)
{
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ int rc = 0;
+ if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
+ return rc;
struct key_part *part = key_def->parts;
const char *tuple_a_raw = tuple_data(tuple_a);
const char *tuple_b_raw = tuple_data(tuple_b);
@@ -469,7 +481,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
struct key_part *end;
const char *field_a, *field_b;
enum mp_type a_type, b_type;
- int rc;
if (is_nullable)
end = part + key_def->unique_part_count;
else
@@ -559,8 +570,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static inline int
-tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
- uint32_t part_count, struct key_def *key_def)
+tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
+ struct key_def *key_def)
+{
+ return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts,
+ has_json_paths>(tuple_a, HINT_NONE, tuple_b, HINT_NONE,
+ key_def);
+}
+
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
+ hint_t tuple_hint, const char *key, uint32_t part_count,
+ hint_t key_hint, struct key_def *key_def)
{
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
@@ -568,6 +590,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
assert(has_optional_parts == key_def->has_optional_parts);
assert(key != NULL || part_count == 0);
assert(part_count <= key_def->part_count);
+ int rc = 0;
+ if ((rc = hint_cmp(tuple_hint, key_hint)) != 0)
+ return rc;
struct key_part *part = key_def->parts;
struct tuple_format *format = tuple_format(tuple);
const char *tuple_raw = tuple_data(tuple);
@@ -603,7 +628,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
}
struct key_part *end = part + part_count;
- int rc;
for (; part < end; ++part, mp_next(&key)) {
const char *field;
if (has_json_paths) {
@@ -642,6 +666,17 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
return 0;
}
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
+ uint32_t part_count, struct key_def *key_def)
+{
+ return tuple_compare_with_key_slowpath_hinted<is_nullable,
+ has_optional_parts, has_json_paths>(tuple,
+ HINT_NONE, key, part_count,
+ HINT_NONE, key_def);
+}
+
template<bool is_nullable>
static inline int
key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
@@ -700,13 +735,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
template<bool is_nullable, bool has_optional_parts>
static inline int
-tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
- uint32_t part_count, struct key_def *key_def)
+tuple_compare_with_key_sequential_hinted(const struct tuple *tuple,
+ hint_t tuple_hint, const char *key, uint32_t part_count,
+ hint_t key_hint, struct key_def *key_def)
{
assert(!has_optional_parts || is_nullable);
assert(key_def_is_sequential(key_def));
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
const char *tuple_key = tuple_data(tuple);
uint32_t field_count = mp_decode_array(&tuple_key);
uint32_t cmp_part_count;
@@ -716,8 +755,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
assert(field_count >= part_count);
cmp_part_count = part_count;
}
- int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
- key_def);
+ rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
+ key_def);
if (!has_optional_parts || rc != 0)
return rc;
/*
@@ -739,6 +778,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
return 0;
}
+template<bool is_nullable, bool has_optional_parts>
+static inline int
+tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
+ uint32_t part_count, struct key_def *key_def)
+{
+ return tuple_compare_with_key_sequential_hinted<is_nullable,
+ has_optional_parts>(tuple, HINT_NONE, key,
+ part_count, HINT_NONE, key_def);
+}
+
int
key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
{
@@ -759,13 +808,17 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
template <bool is_nullable, bool has_optional_parts>
static int
-tuple_compare_sequential(const struct tuple *tuple_a,
- const struct tuple *tuple_b, key_def *key_def)
+tuple_compare_sequential_hinted(const struct tuple *tuple_a,
+ hint_t tuple_a_hint, const struct tuple *tuple_b,
+ hint_t tuple_b_hint, key_def *key_def)
{
assert(!has_optional_parts || is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
assert(key_def_is_sequential(key_def));
assert(is_nullable == key_def->is_nullable);
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
const char *key_a = tuple_data(tuple_a);
uint32_t fc_a = mp_decode_array(&key_a);
const char *key_b = tuple_data(tuple_b);
@@ -779,7 +832,6 @@ tuple_compare_sequential(const struct tuple *tuple_a,
bool was_null_met = false;
struct key_part *part = key_def->parts;
struct key_part *end = part + key_def->unique_part_count;
- int rc;
uint32_t i = 0;
for (; part < end; ++part, ++i) {
enum mp_type a_type, b_type;
@@ -826,6 +878,16 @@ tuple_compare_sequential(const struct tuple *tuple_a,
return 0;
}
+template <bool is_nullable, bool has_optional_parts>
+static int
+tuple_compare_sequential(const struct tuple *tuple_a,
+ const struct tuple *tuple_b, key_def *key_def)
+{
+ return tuple_compare_sequential_hinted<is_nullable,
+ has_optional_parts>(tuple_a, HINT_NONE, tuple_b,
+ HINT_NONE, key_def);
+}
+
template <int TYPE>
static inline int
field_compare(const char **field_a, const char **field_b);
@@ -956,6 +1018,18 @@ struct TupleCompare
compare(tuple_a, tuple_b, format_a,
format_b, field_a, field_b);
}
+
+ static int compare_hinted(const struct tuple *tuple_a,
+ hint_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ hint_t tuple_b_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple_a, tuple_b, key_def);
+ }
};
template <int TYPE, int ...MORE_TYPES>
@@ -973,15 +1047,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> {
return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b,
format_a, format_b, field_a, field_b);
}
+
+ static int compare_hinted(const struct tuple *tuple_a,
+ hint_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ hint_t tuple_b_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple_a, tuple_b, key_def);
+ }
};
} /* end of anonymous namespace */
struct comparator_signature {
tuple_compare_t f;
+ tuple_compare_hinted_t f_hinted;
uint32_t p[64];
};
#define COMPARATOR(...) \
- { TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } },
+ { TupleCompare<__VA_ARGS__>::compare, \
+ TupleCompare<__VA_ARGS__>::compare_hinted, \
+ { __VA_ARGS__, UINT32_MAX } },
/**
* field1 no, field1 type, field2 no, field2 type, ...
@@ -1133,6 +1222,17 @@ struct TupleCompareWithKey
compare(tuple, key, part_count,
key_def, format, field);
}
+
+ static int
+ compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+ const char *key, uint32_t part_count, hint_t key_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple, key, part_count, key_def);
+ }
};
template <int TYPE, int ...MORE_TYPES>
@@ -1153,6 +1253,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
compare(tuple, key, part_count,
key_def, format, field);
}
+
+ static int
+ compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+ const char *key, uint32_t part_count, hint_t key_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple, key, part_count, key_def);
+ }
};
} /* end of anonymous namespace */
@@ -1160,11 +1271,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
struct comparator_with_key_signature
{
tuple_compare_with_key_t f;
+ tuple_compare_with_key_hinted_t f_hinted;
uint32_t p[64];
};
#define KEY_COMPARATOR(...) \
- { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
+ { TupleCompareWithKey<0, __VA_ARGS__>::compare, \
+ TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \
+ { __VA_ARGS__ } },
static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
@@ -1187,16 +1301,22 @@ static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
#define COMPARE_SLOWPATH(...) \
{tuple_compare_slowpath<__VA_ARGS__>, \
tuple_compare_with_key_slowpath<__VA_ARGS__>, \
+ tuple_compare_slowpath_hinted<__VA_ARGS__>, \
+ tuple_compare_with_key_slowpath_hinted<__VA_ARGS__>, \
__VA_ARGS__, false}
#define COMPARE_SEQUENTIAL(...) \
{tuple_compare_sequential<__VA_ARGS__>, \
tuple_compare_with_key_sequential<__VA_ARGS__>, \
+ tuple_compare_sequential_hinted<__VA_ARGS__>, \
+ tuple_compare_with_key_sequential_hinted<__VA_ARGS__>, \
__VA_ARGS__, false, true}
static struct {
tuple_compare_t tuple_compare;
tuple_compare_with_key_t tuple_compare_with_key;
+ tuple_compare_hinted_t tuple_compare_hinted;
+ tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
bool is_nullable;
bool has_optional_parts;
bool has_json_paths;
@@ -1221,11 +1341,236 @@ static struct {
/* }}} tuple_compare_with_key */
+static inline hint_t
+tuple_hint_new(enum mp_class type, uint64_t val)
+{
+ assert(((uint64_t)type >> (sizeof(hint_t) * CHAR_BIT - HINT_VALUE_BITS)) == 0);
+ assert(val >> HINT_VALUE_BITS == 0);
+ return (hint_t)(((uint64_t)type << HINT_VALUE_BITS) | val);
+}
+
+static int
+hint_cmp(hint_t hint_a, hint_t hint_b)
+{
+ if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b)
+ return hint_a < hint_b ? -1 : 1;
+ else
+ return 0;
+}
+
+static hint_t
+field_hint_unsigned(const char *field)
+{
+ assert(mp_typeof(*field) == MP_UINT);
+ uint64_t val = mp_decode_uint(&field);
+ if (val > HINT_MAX)
+ val = HINT_MAX;
+ uint64_t ret = val - (uint64_t)HINT_MIN;
+ return tuple_hint_new(MP_CLASS_NUMBER, ret);
+}
+
+static hint_t
+field_hint_integer(const char *field)
+{
+ switch (mp_typeof(*field)) {
+ case MP_INT: {
+ int64_t val = mp_decode_int(&field);
+ if (val > HINT_MAX)
+ val = HINT_MAX;
+ else if (val < HINT_MIN)
+ val = HINT_MIN;
+ uint64_t ret = (uint64_t)val - (uint64_t)HINT_MIN;
+ return tuple_hint_new(MP_CLASS_NUMBER, ret);
+ }
+ case MP_UINT:
+ return field_hint_unsigned(field);
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+static hint_t
+field_hint_number(const char *field)
+{
+ enum mp_type type = mp_typeof(*field);
+ switch (type) {
+ case MP_FLOAT:
+ case MP_DOUBLE: {
+ double f = type == MP_FLOAT ?
+ mp_decode_float(&field) : mp_decode_double(&field);
+ if (!isfinite(f))
+ return HINT_NONE;
+ uint64_t ret;
+ if (f > exp2(HINT_VALUE_BITS - 1))
+ ret = (uint64_t)HINT_MAX - (uint64_t)HINT_MIN;
+ else if (f < -exp2(HINT_VALUE_BITS - 1))
+ ret = 0;
+ else {
+ double val;
+ (void)modf(f, &val);
+ ret = (uint64_t)val - (uint64_t)HINT_MIN;
+ }
+ return tuple_hint_new(MP_CLASS_NUMBER, ret);
+ }
+ case MP_UINT:
+ return field_hint_unsigned(field);
+ case MP_INT:
+ return field_hint_integer(field);
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+static hint_t
+field_hint_boolean(const char *field)
+{
+ assert(mp_typeof(*field) == MP_BOOL);
+ return tuple_hint_new(MP_CLASS_BOOL, mp_decode_bool(&field) ? 1 : 0);
+}
+
+static uint64_t
+memory_buffer_hint_val(const unsigned char *raw, uint32_t size)
+{
+ int process_len = MIN((int)size, 7);
+ uint64_t result = 0;
+ for (int i = 0; i < process_len; i++) {
+ result <<= CHAR_BIT;
+ result |= raw[i];
+ }
+ result <<= CHAR_BIT * (7 - process_len);
+ return result;
+}
+
+template <bool has_collation>
+static hint_t
+field_hint_string(const char *field, struct coll *coll)
+{
+ assert(has_collation == (coll != NULL));
+ assert(mp_typeof(*field) == MP_STR);
+ char buff[7];
+ uint32_t len;
+ const char *str = mp_decode_str(&field, &len);
+ if (has_collation) {
+ len = coll->get_sort_key(str, len, buff, sizeof(buff), coll);
+ str = buff;
+ }
+ uint64_t ret = memory_buffer_hint_val((const unsigned char *)str, len);
+ return tuple_hint_new(MP_CLASS_STR, ret);
+}
+
+template <bool has_collation>
+static hint_t
+field_hint_scalar(const char *field, struct coll *coll)
+{
+ switch (mp_classof(mp_typeof(*field))) {
+ case MP_CLASS_NUMBER:
+ return field_hint_number(field);
+ case MP_CLASS_BOOL:
+ return field_hint_boolean(field);
+ case MP_CLASS_STR:
+ return field_hint_string<has_collation>(field, coll);
+ case MP_CLASS_BIN: {
+ assert(coll == NULL);
+ uint32_t size;
+ const unsigned char *raw =
+ (const unsigned char *)mp_decode_bin(&field, &size);
+ uint64_t ret = memory_buffer_hint_val(raw, size);
+ return tuple_hint_new(MP_CLASS_BIN, ret);
+ }
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable, bool has_collation>
+static hint_t
+field_hint(const char *field, struct key_def *key_def)
+{
+ assert(type == key_def->parts->type);
+ assert(key_part_is_nullable(key_def->parts) == is_nullable);
+ assert(has_collation == (key_def->parts->coll != NULL));
+ switch (type) {
+ case FIELD_TYPE_UNSIGNED:
+ return field_hint_unsigned(field);
+ case FIELD_TYPE_INTEGER:
+ return field_hint_integer(field);
+ case FIELD_TYPE_NUMBER:
+ return field_hint_number(field);
+ case FIELD_TYPE_BOOLEAN:
+ return field_hint_boolean(field);
+ case FIELD_TYPE_STRING:
+ return field_hint_string<has_collation>(field,
+ key_def->parts->coll);
+ case FIELD_TYPE_SCALAR:
+ return field_hint_scalar<has_collation>(field,
+ key_def->parts->coll);
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable, bool has_collation>
+static hint_t
+key_hint(const char *key, struct key_def *key_def)
+{
+ if (key == NULL || (is_nullable && mp_typeof(*key) == MP_NIL))
+ return tuple_hint_new(MP_CLASS_NIL, 0);
+ return field_hint<type, is_nullable, has_collation>(key, key_def);
+}
+
+template <enum field_type type, bool is_nullable, bool has_collation>
+static hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && (field == NULL || mp_typeof(*field) == MP_NIL))
+ return tuple_hint_new(MP_CLASS_NIL, 0);
+ return field_hint<type, is_nullable, has_collation>(field, key_def);
+}
+
+#define HINT(...) { \
+ tuple_hint<__VA_ARGS__>, key_hint<__VA_ARGS__>, __VA_ARGS__ }
+
+static const struct {
+ tuple_hint_t tuple_hint;
+ key_hint_t key_hint;
+ enum field_type type;
+ bool is_nullable;
+ bool has_collation;
+} hint_arr[] = {
+ HINT(FIELD_TYPE_UNSIGNED, false, false),
+ HINT(FIELD_TYPE_STRING, false, false),
+ HINT(FIELD_TYPE_NUMBER, false, false),
+ HINT(FIELD_TYPE_INTEGER, false, false),
+ HINT(FIELD_TYPE_BOOLEAN, false, false),
+ HINT(FIELD_TYPE_SCALAR, false, false),
+ HINT(FIELD_TYPE_UNSIGNED, true, false),
+ HINT(FIELD_TYPE_STRING, true, false),
+ HINT(FIELD_TYPE_NUMBER, true, false),
+ HINT(FIELD_TYPE_INTEGER, true, false),
+ HINT(FIELD_TYPE_BOOLEAN, true, false),
+ HINT(FIELD_TYPE_SCALAR, true, false),
+ HINT(FIELD_TYPE_STRING, false, true),
+ HINT(FIELD_TYPE_SCALAR, false, true),
+ HINT(FIELD_TYPE_STRING, true, true),
+ HINT(FIELD_TYPE_SCALAR, true, true)
+};
+
+#undef HINT
+
void
key_def_set_compare_func(struct key_def *def)
{
def->tuple_compare = NULL;
def->tuple_compare_with_key = NULL;
+ def->tuple_compare_hinted = NULL;
+ def->tuple_compare_with_key_hinted = NULL;
+ def->key_hint = NULL;
+ def->tuple_hint = NULL;
if (!def->is_nullable && !key_def_has_collation(def) &&
!def->has_json_paths) {
/* Precalculated comparators don't use collation */
@@ -1242,6 +1587,8 @@ key_def_set_compare_func(struct key_def *def)
if (i == def->part_count &&
precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) {
def->tuple_compare = precompiled_cmp_arr[k].f;
+ def->tuple_compare_hinted =
+ precompiled_cmp_arr[k].f_hinted;
break;
}
}
@@ -1258,6 +1605,8 @@ key_def_set_compare_func(struct key_def *def)
if (i == def->part_count) {
def->tuple_compare_with_key =
precompiled_cmp_wk_arr[k].f;
+ def->tuple_compare_with_key_hinted =
+ precompiled_cmp_wk_arr[k].f_hinted;
break;
}
}
@@ -1274,11 +1623,16 @@ key_def_set_compare_func(struct key_def *def)
if (def->tuple_compare == NULL) {
def->tuple_compare =
cmp_arr[k].tuple_compare;
+ def->tuple_compare_hinted =
+ cmp_arr[k].tuple_compare_hinted;
}
if (def->tuple_compare_with_key == NULL) {
def->tuple_compare_with_key =
cmp_arr[k].
tuple_compare_with_key;
+ def->tuple_compare_with_key_hinted =
+ cmp_arr[k].
+ tuple_compare_with_key_hinted;
}
break;
}
@@ -1286,4 +1640,14 @@ key_def_set_compare_func(struct key_def *def)
}
assert(def->tuple_compare != NULL &&
def->tuple_compare_with_key != NULL);
+ for (uint32_t k = 0; k < sizeof(hint_arr) / sizeof(hint_arr[0]); k++) {
+ if (def->parts->type == hint_arr[k].type &&
+ key_part_is_nullable(def->parts) ==
+ hint_arr[k].is_nullable &&
+ (def->parts->coll != NULL) == hint_arr[k].has_collation) {
+ def->key_hint = hint_arr[k].key_hint;
+ def->tuple_hint = hint_arr[k].tuple_hint;
+ break;
+ }
+ }
}
diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
index afc15e809..d110b156b 100644
--- a/src/lib/coll/coll.c
+++ b/src/lib/coll/coll.c
@@ -133,6 +133,33 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
return s_len;
}
+int
+coll_icu_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len,
+ struct coll *coll)
+{
+ assert(coll->type == COLL_TYPE_ICU);
+ UCharIterator itr;
+ uiter_setUTF8(&itr, s, s_len);
+ uint32_t state[2] = {0, 0};
+ UErrorCode status = U_ZERO_ERROR;
+ int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state,
+ (uint8_t *)buff, buff_len, &status);
+ assert(got >= 0 && (size_t)got <= buff_len);
+ assert(U_SUCCESS(status));
+ return got;
+}
+
+int
+coll_bin_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len,
+ struct coll *coll)
+{
+ (void)coll;
+ assert(coll->type == COLL_TYPE_BINARY);
+ int to_copy = (int)MIN(s_len, buff_len);
+ memcpy(buff, s, to_copy);
+ return to_copy;
+}
+
/**
* Set up ICU collator and init cmp and hash members of collation.
* @param coll Collation to set up.
@@ -242,6 +269,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
}
coll->cmp = coll_icu_cmp;
coll->hash = coll_icu_hash;
+ coll->get_sort_key = coll_icu_get_sort_key;
return 0;
}
@@ -356,6 +384,7 @@ coll_new(const struct coll_def *def)
coll->collator = NULL;
coll->cmp = coll_bin_cmp;
coll->hash = coll_bin_hash;
+ coll->get_sort_key = coll_bin_get_sort_key;
break;
default:
unreachable();
diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h
index 8c9f94293..a68668ec5 100644
--- a/src/lib/coll/coll.h
+++ b/src/lib/coll/coll.h
@@ -48,6 +48,9 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
uint32_t *pcarry, struct coll *coll);
+typedef int (*coll_get_sort_key_f)(const char *s, size_t s_len, char *buff,
+ size_t buff_len, struct coll *coll);
+
struct UCollator;
/**
@@ -62,6 +65,8 @@ struct coll {
/** String comparator. */
coll_cmp_f cmp;
coll_hash_f hash;
+	/** Write a sort key for the string into the given buffer. */
+ coll_get_sort_key_f get_sort_key;
/** Reference counter. */
int refs;
/**
diff --git a/test/engine/ddl.result b/test/engine/ddl.result
index 8d34d5ef4..1791138c6 100644
--- a/test/engine/ddl.result
+++ b/test/engine/ddl.result
@@ -1935,6 +1935,186 @@ s:drop()
---
...
--
+-- gh-3961: Introduce tuple comparison hints
+--
+s = box.schema.space.create('test', {engine = engine})
+---
+...
+pk = s:create_index('pk')
+---
+...
+i1 = s:create_index('i1', {unique = true, parts = {2, 'scalar'}})
+---
+...
+bdbl = (2.1)^62;
+---
+...
+nbdbl = -bdbl
+---
+...
+_ = s:insert{0, bdbl}
+---
+...
+_ = s:insert{1, nbdbl}
+---
+...
+_ = s:insert{2, -2.3}
+---
+...
+_ = s:insert{3, -2}
+---
+...
+_ = s:insert{4, -1.3}
+---
+...
+_ = s:insert{5, -1}
+---
+...
+_ = s:insert{6, 0}
+---
+...
+_ = s:insert{7, 1}
+---
+...
+_ = s:insert{8, 1.3}
+---
+...
+_ = s:insert{9, 2}
+---
+...
+_ = s:insert{10, 2.3}
+---
+...
+i1:select()
+---
+- - [1, -9.4972150816945e+19]
+ - [2, -2.3]
+ - [3, -2]
+ - [4, -1.3]
+ - [5, -1]
+ - [6, 0]
+ - [7, 1]
+ - [8, 1.3]
+ - [9, 2]
+ - [10, 2.3]
+ - [0, 9.4972150816945e+19]
+...
+i1:alter{parts = {2, 'number'}}
+---
+...
+_ = s:insert{11, -1.2}
+---
+...
+_ = s:insert{12, 1.2}
+---
+...
+_ = s:insert{13, -5}
+---
+...
+_ = s:insert{14, 5}
+---
+...
+i1:select()
+---
+- - [1, -9.4972150816945e+19]
+ - [13, -5]
+ - [2, -2.3]
+ - [3, -2]
+ - [4, -1.3]
+ - [11, -1.2]
+ - [5, -1]
+ - [6, 0]
+ - [7, 1]
+ - [12, 1.2]
+ - [8, 1.3]
+ - [9, 2]
+ - [10, 2.3]
+ - [14, 5]
+ - [0, 9.4972150816945e+19]
+...
+_ = i1:delete({-2.3})
+---
+...
+_ = i1:delete({-1.3})
+---
+...
+_ = i1:delete({1.3})
+---
+...
+_ = i1:delete({2.3})
+---
+...
+_ = i1:delete({-1.2})
+---
+...
+_ = i1:delete({1.2})
+---
+...
+_ = i1:delete({bdbl})
+---
+...
+_ = i1:delete({nbdbl})
+---
+...
+i1:alter{parts = {2, 'integer'}}
+---
+...
+_ = s:insert{15, -4}
+---
+...
+_ = s:insert{16, 4}
+---
+...
+i1:select()
+---
+- - [13, -5]
+ - [15, -4]
+ - [3, -2]
+ - [5, -1]
+ - [6, 0]
+ - [7, 1]
+ - [9, 2]
+ - [16, 4]
+ - [14, 5]
+...
+_ = i1:delete({-5})
+---
+...
+_ = i1:delete({-4})
+---
+...
+_ = i1:delete({-2})
+---
+...
+_ = i1:delete({-1})
+---
+...
+i1:alter{parts = {2, 'unsigned'}}
+---
+...
+_ = s:insert({17, 10})
+---
+...
+i1:alter{parts = {2, 'scalar'}}
+---
+...
+_ = s:insert({18, 1.1})
+---
+...
+i1:select()
+---
+- - [6, 0]
+ - [7, 1]
+ - [18, 1.1]
+ - [9, 2]
+ - [16, 4]
+ - [14, 5]
+ - [17, 10]
+...
+s:drop()
+---
+...
+--
-- Creating/altering a secondary index of a non-empty space.
--
s = box.schema.space.create('test', {engine = engine})
diff --git a/test/engine/ddl.test.lua b/test/engine/ddl.test.lua
index cdaf7a5bf..a73075498 100644
--- a/test/engine/ddl.test.lua
+++ b/test/engine/ddl.test.lua
@@ -710,6 +710,55 @@ _ = s:create_index('sk', {parts = {2, 'unsigned'}})
c:get()
s:drop()
+--
+-- gh-3961: Introduce tuple comparison hints
+--
+s = box.schema.space.create('test', {engine = engine})
+pk = s:create_index('pk')
+i1 = s:create_index('i1', {unique = true, parts = {2, 'scalar'}})
+bdbl = (2.1)^62;
+nbdbl = -bdbl
+_ = s:insert{0, bdbl}
+_ = s:insert{1, nbdbl}
+_ = s:insert{2, -2.3}
+_ = s:insert{3, -2}
+_ = s:insert{4, -1.3}
+_ = s:insert{5, -1}
+_ = s:insert{6, 0}
+_ = s:insert{7, 1}
+_ = s:insert{8, 1.3}
+_ = s:insert{9, 2}
+_ = s:insert{10, 2.3}
+i1:select()
+i1:alter{parts = {2, 'number'}}
+_ = s:insert{11, -1.2}
+_ = s:insert{12, 1.2}
+_ = s:insert{13, -5}
+_ = s:insert{14, 5}
+i1:select()
+_ = i1:delete({-2.3})
+_ = i1:delete({-1.3})
+_ = i1:delete({1.3})
+_ = i1:delete({2.3})
+_ = i1:delete({-1.2})
+_ = i1:delete({1.2})
+_ = i1:delete({bdbl})
+_ = i1:delete({nbdbl})
+i1:alter{parts = {2, 'integer'}}
+_ = s:insert{15, -4}
+_ = s:insert{16, 4}
+i1:select()
+_ = i1:delete({-5})
+_ = i1:delete({-4})
+_ = i1:delete({-2})
+_ = i1:delete({-1})
+i1:alter{parts = {2, 'unsigned'}}
+_ = s:insert({17, 10})
+i1:alter{parts = {2, 'scalar'}}
+_ = s:insert({18, 1.1})
+i1:select()
+s:drop()
+
--
-- Creating/altering a secondary index of a non-empty space.
--
--
2.21.0
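For readers following the thread, the hint scheme used throughout the patch above can be pictured with a standalone sketch. This is an illustration under the layout described in this series (two bytes of value class on top, the rest a biased value prefix), not the patch code itself; names such as `make_hint` and `hint_int` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hint_t;

/* Assumed layout: top 16 bits = value class, low 48 bits = value prefix. */
#define HINT_VALUE_BITS 48
/* "No hint": never orders against anything, always forces a full compare. */
#define HINT_NONE ((hint_t)UINT64_MAX)

enum hint_class { CLASS_NIL, CLASS_BOOL, CLASS_NUMBER, CLASS_STR, CLASS_BIN };

static hint_t
make_hint(enum hint_class cls, uint64_t val)
{
	assert(val >> HINT_VALUE_BITS == 0);
	return ((hint_t)cls << HINT_VALUE_BITS) | val;
}

/* Returns -1/1 when the hints alone order the tuples, 0 when a full
 * tuple_compare is still required (equal or incomparable hints). */
static int
hint_cmp(hint_t a, hint_t b)
{
	if (a != HINT_NONE && b != HINT_NONE && a != b)
		return a < b ? -1 : 1;
	return 0;
}

/* Signed integers are biased by -HINT_MIN so that comparing the hints
 * as unsigned values matches the signed order of the originals. */
static hint_t
hint_int(int64_t v)
{
	const int64_t hint_min = -(INT64_C(1) << (HINT_VALUE_BITS - 1));
	const int64_t hint_max = (INT64_C(1) << (HINT_VALUE_BITS - 1)) - 1;
	if (v < hint_min)
		v = hint_min;
	if (v > hint_max)
		v = hint_max;
	return make_hint(CLASS_NUMBER, (uint64_t)v - (uint64_t)hint_min);
}
```

With this layout, `hint_cmp(hint_int(-5), hint_int(3))` resolves the order without touching tuple data, and the class bits make nil < bool < number < string < binary hold across types; equal hints (e.g. two long strings sharing the same stored prefix) fall through to the real comparator.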
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
2019-03-14 7:04 ` Vladimir Davydov
@ 2019-03-15 10:20 ` Kirill Shcherbatov
2019-03-15 10:55 ` Kirill Shcherbatov
2019-03-19 19:38 ` Vladimir Davydov
0 siblings, 2 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:20 UTC (permalink / raw)
To: tarantool-patches, Vladimir Davydov
Refactored the key_def_set_compare_func routine to make it easier
to read, maintain, and extend. This is necessary because subsequent
patches in this series will extend it significantly.
Needed for #3961
---
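As a note on the table-driven approach this patch moves to: the selection logic can be sketched as below. This is an illustrative reduction, not the patch code; the struct fields and comparator names are stand-ins. Each comparator variant is listed once together with the key_def properties it serves, so key_def_set_compare_func becomes a linear scan over a table instead of nested branches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the real types. */
struct key_def {
	bool is_nullable;
	bool has_optional_parts;
	bool is_sequential;
	int (*tuple_compare)(const void *, const void *, struct key_def *);
};

static int cmp_plain(const void *a, const void *b, struct key_def *d)
{ (void)a; (void)b; (void)d; return 0; }
static int cmp_nullable(const void *a, const void *b, struct key_def *d)
{ (void)a; (void)b; (void)d; return 0; }

/* One row per comparator instantiation: the function plus the
 * key_def flags it was compiled for. */
static const struct {
	int (*tuple_compare)(const void *, const void *, struct key_def *);
	bool is_nullable;
	bool has_optional_parts;
	bool is_sequential;
} cmp_arr[] = {
	{ cmp_plain,    false, false, false },
	{ cmp_nullable, true,  false, false },
};

static void
key_def_set_compare_func(struct key_def *def)
{
	def->tuple_compare = NULL;
	for (size_t k = 0; k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
		if (def->is_nullable == cmp_arr[k].is_nullable &&
		    def->has_optional_parts == cmp_arr[k].has_optional_parts &&
		    def->is_sequential == cmp_arr[k].is_sequential) {
			def->tuple_compare = cmp_arr[k].tuple_compare;
			break;
		}
	}
}
```

Adding a new dimension later (for instance the hinted variants in the follow-up patches) then only needs one more field per table row rather than another level of if/else nesting.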
src/box/tuple_compare.cc | 195 +++++++++++++++++++--------------------
1 file changed, 93 insertions(+), 102 deletions(-)
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index dfe30b190..7fe1766a8 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -986,7 +986,7 @@ struct comparator_signature {
/**
* field1 no, field1 type, field2 no, field2 type, ...
*/
-static const comparator_signature cmp_arr[] = {
+static const comparator_signature precompiled_cmp_arr[] = {
COMPARATOR(0, FIELD_TYPE_UNSIGNED)
COMPARATOR(0, FIELD_TYPE_STRING)
COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED)
@@ -1005,55 +1005,6 @@ static const comparator_signature cmp_arr[] = {
#undef COMPARATOR
-static const tuple_compare_t compare_slowpath_funcs[] = {
- tuple_compare_slowpath<false, false, false>,
- tuple_compare_slowpath<true, false, false>,
- tuple_compare_slowpath<false, true, false>,
- tuple_compare_slowpath<true, true, false>,
- tuple_compare_slowpath<false, false, true>,
- tuple_compare_slowpath<true, false, true>,
- tuple_compare_slowpath<false, true, true>,
- tuple_compare_slowpath<true, true, true>
-};
-
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
-{
- int cmp_func_idx = (def->is_nullable ? 1 : 0) +
- 2 * (def->has_optional_parts ? 1 : 0) +
- 4 * (def->has_json_paths ? 1 : 0);
- if (def->is_nullable) {
- if (key_def_is_sequential(def)) {
- if (def->has_optional_parts)
- return tuple_compare_sequential<true, true>;
- else
- return tuple_compare_sequential<true, false>;
- } else {
- return compare_slowpath_funcs[cmp_func_idx];
- }
- }
- assert(! def->has_optional_parts);
- if (!key_def_has_collation(def) && !def->has_json_paths) {
- /* Precalculated comparators don't use collation */
- for (uint32_t k = 0;
- k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
- uint32_t i = 0;
- for (; i < def->part_count; i++)
- if (def->parts[i].fieldno !=
- cmp_arr[k].p[i * 2] ||
- def->parts[i].type !=
- cmp_arr[k].p[i * 2 + 1])
- break;
- if (i == def->part_count &&
- cmp_arr[k].p[i * 2] == UINT32_MAX)
- return cmp_arr[k].f;
- }
- }
- return key_def_is_sequential(def) ?
- tuple_compare_sequential<false, false> :
- compare_slowpath_funcs[cmp_func_idx];
-}
-
/* }}} tuple_compare */
/* {{{ tuple_compare_with_key */
@@ -1215,7 +1166,7 @@ struct comparator_with_key_signature
#define KEY_COMPARATOR(...) \
{ TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
-static const comparator_with_key_signature cmp_wk_arr[] = {
+static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
KEY_COMPARATOR(0, FIELD_TYPE_STRING , 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_STRING , 2, FIELD_TYPE_UNSIGNED)
@@ -1231,68 +1182,108 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
KEY_COMPARATOR(1, FIELD_TYPE_STRING , 2, FIELD_TYPE_STRING)
};
-#undef KEY_COMPARATOR
-
-static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
- tuple_compare_with_key_slowpath<false, false, false>,
- tuple_compare_with_key_slowpath<true, false, false>,
- tuple_compare_with_key_slowpath<false, true, false>,
- tuple_compare_with_key_slowpath<true, true, false>,
- tuple_compare_with_key_slowpath<false, false, true>,
- tuple_compare_with_key_slowpath<true, false, true>,
- tuple_compare_with_key_slowpath<false, true, true>,
- tuple_compare_with_key_slowpath<true, true, true>
+#undef KEY_COMPARATOR
+
+#define COMPARE_SLOWPATH(...) \
+ {tuple_compare_slowpath<__VA_ARGS__>, \
+ tuple_compare_with_key_slowpath<__VA_ARGS__>, \
+ __VA_ARGS__, false}
+
+#define COMPARE_SEQUENTIAL(...) \
+ {tuple_compare_sequential<__VA_ARGS__>, \
+ tuple_compare_with_key_sequential<__VA_ARGS__>, \
+ __VA_ARGS__, false, true}
+
+static struct {
+ tuple_compare_t tuple_compare;
+ tuple_compare_with_key_t tuple_compare_with_key;
+ bool is_nullable;
+ bool has_optional_parts;
+ bool has_json_paths;
+ bool is_sequential;
+} cmp_arr[] = {
+ COMPARE_SLOWPATH(false, false, false),
+ COMPARE_SLOWPATH(true, false, false),
+ COMPARE_SLOWPATH(false, true, false),
+ COMPARE_SLOWPATH(true, true, false),
+ COMPARE_SLOWPATH(false, false, true),
+ COMPARE_SLOWPATH(true, false, true),
+ COMPARE_SLOWPATH(false, true, true),
+ COMPARE_SLOWPATH(true, true, true),
+ COMPARE_SEQUENTIAL(false, false),
+ COMPARE_SEQUENTIAL(true, false),
+ COMPARE_SEQUENTIAL(false, true),
+ COMPARE_SEQUENTIAL(true, true),
};
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
+#undef COMPARE_SLOWPATH
+#undef COMPARE_SEQUENTIAL
+
+/* }}} tuple_compare_with_key */
+
+void
+key_def_set_compare_func(struct key_def *def)
{
- int cmp_func_idx = (def->is_nullable ? 1 : 0) +
- 2 * (def->has_optional_parts ? 1 : 0) +
- 4 * (def->has_json_paths ? 1 : 0);
- if (def->is_nullable) {
- if (key_def_is_sequential(def)) {
- if (def->has_optional_parts) {
- return tuple_compare_with_key_sequential<true,
- true>;
- } else {
- return tuple_compare_with_key_sequential<true,
- false>;
+ def->tuple_compare = NULL;
+ def->tuple_compare_with_key = NULL;
+ if (!def->is_nullable && !key_def_has_collation(def) &&
+ !def->has_json_paths) {
+ /* Precalculated comparators don't use collation */
+ for (uint32_t k = 0; k < sizeof(precompiled_cmp_arr) /
+ sizeof(precompiled_cmp_arr[0]); k++) {
+ uint32_t i = 0;
+ for (; i < def->part_count; i++) {
+ if (def->parts[i].fieldno !=
+ precompiled_cmp_arr[k].p[i * 2] ||
+ def->parts[i].type !=
+ precompiled_cmp_arr[k].p[i * 2 + 1])
+ break;
+ }
+ if (i == def->part_count &&
+ precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) {
+ def->tuple_compare = precompiled_cmp_arr[k].f;
+ break;
}
- } else {
- return compare_with_key_slowpath_funcs[cmp_func_idx];
}
- }
- assert(! def->has_optional_parts);
- if (!key_def_has_collation(def) && !def->has_json_paths) {
- /* Precalculated comparators don't use collation */
- for (uint32_t k = 0;
- k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
- k++) {
-
+ for (uint32_t k = 0; k < sizeof(precompiled_cmp_wk_arr) /
+ sizeof(precompiled_cmp_wk_arr[0]); k++) {
uint32_t i = 0;
for (; i < def->part_count; i++) {
if (def->parts[i].fieldno !=
- cmp_wk_arr[k].p[i * 2] ||
- def->parts[i].type !=
- cmp_wk_arr[k].p[i * 2 + 1]) {
+ precompiled_cmp_wk_arr[k].p[i * 2] ||
+ def->parts[i].type !=
+ precompiled_cmp_wk_arr[k].p[i * 2 + 1])
break;
+ }
+ if (i == def->part_count) {
+ def->tuple_compare_with_key =
+ precompiled_cmp_wk_arr[k].f;
+ break;
+ }
+ }
+ }
+ if (def->tuple_compare == NULL || def->tuple_compare_with_key == NULL) {
+ for (uint32_t k = 0; k < sizeof(cmp_arr) /
+ sizeof(cmp_arr[0]); k++) {
+ if (def->is_nullable == cmp_arr[k].is_nullable &&
+ def->has_optional_parts ==
+ cmp_arr[k].has_optional_parts &&
+ def->has_json_paths == cmp_arr[k].has_json_paths &&
+ key_def_is_sequential(def) ==
+ cmp_arr[k].is_sequential) {
+ if (def->tuple_compare == NULL) {
+ def->tuple_compare =
+ cmp_arr[k].tuple_compare;
}
+ if (def->tuple_compare_with_key == NULL) {
+ def->tuple_compare_with_key =
+ cmp_arr[k].
+ tuple_compare_with_key;
+ }
+ break;
}
- if (i == def->part_count)
- return cmp_wk_arr[k].f;
}
}
- return key_def_is_sequential(def) ?
- tuple_compare_with_key_sequential<false, false> :
- compare_with_key_slowpath_funcs[cmp_func_idx];
-}
-
-/* }}} tuple_compare_with_key */
-
-void
-key_def_set_compare_func(struct key_def *def)
-{
- def->tuple_compare = tuple_compare_create(def);
- def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+ assert(def->tuple_compare != NULL &&
+ def->tuple_compare_with_key != NULL);
}
--
2.21.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
@ 2019-03-15 10:55 ` Kirill Shcherbatov
2019-03-19 19:38 ` Vladimir Davydov
1 sibling, 0 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:55 UTC (permalink / raw)
To: tarantool-patches, Vladimir Davydov
@@ -1182,7 +1182,7 @@ static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
KEY_COMPARATOR(1, FIELD_TYPE_STRING , 2, FIELD_TYPE_STRING)
};
-#undef KEY_COMPARATORq
+#undef KEY_COMPARATOR
* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
2019-03-15 10:55 ` Kirill Shcherbatov
@ 2019-03-19 19:38 ` Vladimir Davydov
1 sibling, 0 replies; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-19 19:38 UTC (permalink / raw)
To: Kirill Shcherbatov; +Cc: tarantool-patches
On Fri, Mar 15, 2019 at 01:20:11PM +0300, Kirill Shcherbatov wrote:
> +#define COMPARE_SLOWPATH(...) \
> + {tuple_compare_slowpath<__VA_ARGS__>, \
> + tuple_compare_with_key_slowpath<__VA_ARGS__>, \
> + __VA_ARGS__, false}
> +
> +#define COMPARE_SEQUENTIAL(...) \
> + {tuple_compare_sequential<__VA_ARGS__>, \
> + tuple_compare_with_key_sequential<__VA_ARGS__>, \
> + __VA_ARGS__, false, true}
> +
> +static struct {
> + tuple_compare_t tuple_compare;
> + tuple_compare_with_key_t tuple_compare_with_key;
> + bool is_nullable;
> + bool has_optional_parts;
> + bool has_json_paths;
> + bool is_sequential;
> +} cmp_arr[] = {
> + COMPARE_SLOWPATH(false, false, false),
> + COMPARE_SLOWPATH(true, false, false),
> + COMPARE_SLOWPATH(false, true, false),
> + COMPARE_SLOWPATH(true, true, false),
> + COMPARE_SLOWPATH(false, false, true),
> + COMPARE_SLOWPATH(true, false, true),
> + COMPARE_SLOWPATH(false, true, true),
> + COMPARE_SLOWPATH(true, true, true),
> + COMPARE_SEQUENTIAL(false, false),
> + COMPARE_SEQUENTIAL(true, false),
> + COMPARE_SEQUENTIAL(false, true),
> + COMPARE_SEQUENTIAL(true, true),
> };
>
> -static tuple_compare_with_key_t
> -tuple_compare_with_key_create(const struct key_def *def)
> +#undef COMPARE_SLOWPATH
> +#undef COMPARE_SEQUENTIAL
> +
> +/* }}} tuple_compare_with_key */
> +
> +void
> +key_def_set_compare_func(struct key_def *def)
> {
> - int cmp_func_idx = (def->is_nullable ? 1 : 0) +
> - 2 * (def->has_optional_parts ? 1 : 0) +
> - 4 * (def->has_json_paths ? 1 : 0);
> - if (def->is_nullable) {
> - if (key_def_is_sequential(def)) {
> - if (def->has_optional_parts) {
> - return tuple_compare_with_key_sequential<true,
> - true>;
> - } else {
> - return tuple_compare_with_key_sequential<true,
> - false>;
> + def->tuple_compare = NULL;
> + def->tuple_compare_with_key = NULL;
> + if (!def->is_nullable && !key_def_has_collation(def) &&
> + !def->has_json_paths) {
> + /* Precalculated comparators don't use collation */
> + for (uint32_t k = 0; k < sizeof(precompiled_cmp_arr) /
> + sizeof(precompiled_cmp_arr[0]); k++) {
> + uint32_t i = 0;
> + for (; i < def->part_count; i++) {
> + if (def->parts[i].fieldno !=
> + precompiled_cmp_arr[k].p[i * 2] ||
> + def->parts[i].type !=
> + precompiled_cmp_arr[k].p[i * 2 + 1])
> + break;
> + }
> + if (i == def->part_count &&
> + precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) {
> + def->tuple_compare = precompiled_cmp_arr[k].f;
> + break;
> }
> - } else {
> - return compare_with_key_slowpath_funcs[cmp_func_idx];
> }
> - }
> - assert(! def->has_optional_parts);
> - if (!key_def_has_collation(def) && !def->has_json_paths) {
> - /* Precalculated comparators don't use collation */
> - for (uint32_t k = 0;
> - k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
> - k++) {
> -
> + for (uint32_t k = 0; k < sizeof(precompiled_cmp_wk_arr) /
> + sizeof(precompiled_cmp_wk_arr[0]); k++) {
> uint32_t i = 0;
> for (; i < def->part_count; i++) {
> if (def->parts[i].fieldno !=
> - cmp_wk_arr[k].p[i * 2] ||
> - def->parts[i].type !=
> - cmp_wk_arr[k].p[i * 2 + 1]) {
> + precompiled_cmp_wk_arr[k].p[i * 2] ||
> + def->parts[i].type !=
> + precompiled_cmp_wk_arr[k].p[i * 2 + 1])
> break;
> + }
> + if (i == def->part_count) {
> + def->tuple_compare_with_key =
> + precompiled_cmp_wk_arr[k].f;
> + break;
> + }
> + }
> + }
> + if (def->tuple_compare == NULL || def->tuple_compare_with_key == NULL) {
> + for (uint32_t k = 0; k < sizeof(cmp_arr) /
> + sizeof(cmp_arr[0]); k++) {
> + if (def->is_nullable == cmp_arr[k].is_nullable &&
> + def->has_optional_parts ==
> + cmp_arr[k].has_optional_parts &&
> + def->has_json_paths == cmp_arr[k].has_json_paths &&
> + key_def_is_sequential(def) ==
> + cmp_arr[k].is_sequential) {
> + if (def->tuple_compare == NULL) {
> + def->tuple_compare =
> + cmp_arr[k].tuple_compare;
> }
> + if (def->tuple_compare_with_key == NULL) {
> + def->tuple_compare_with_key =
> + cmp_arr[k].
> + tuple_compare_with_key;
> + }
> + break;
> }
> - if (i == def->part_count)
> - return cmp_wk_arr[k].f;
> }
> }
All these unnamed booleans and variadic macros and linear search over
them look mind-boggling IMO. I understand that this approach is kinda
generic, but it just looks ugly. In fact, I don't see any need for
a generalization like this at this point. See, although we'll have four
parameters - is_nullable, has_optional_parts, has_json_paths, and
is_multikey - they are not independent:
- has_optional_parts depends on is_nullable
- is_multikey depends on has_json_paths
So the number of combinations is far from staggering, and I don't see any
new parameters coming in the near future.
So I dropped this patch and instead split key_def_set_compare_func into
sub-functions by key_def types. You will find the patch below. It should
be trivial to add hinted comparators and multikey support on top of it
without introducing deep if-else hierarchies or excessive code repetition.
---
From 829c811cdeb98bd8e5cb91edf2197f0c340d55ed Mon Sep 17 00:00:00 2001
Message-Id: <829c811cdeb98bd8e5cb91edf2197f0c340d55ed.1553024132.git.vdavydov.dev@gmail.com>
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Date: Tue, 19 Mar 2019 20:27:42 +0300
Subject: [PATCH] tuple_compare: cleanup key_def virtual func setter
Over time key_def virtual func setter code has turned into a complete
mess: now it's an incomprehensible mix of if-else blocks, linear array
search, and devious array indexing. What is worse, it's duplicated by
tuple_compare_create and tuple_compare_with_key_create. Adding yet
another parameter to key_def templates, which is needed to implement
multi-key indexes, is literally impossible.
This patch attempts to alleviate the situation by splitting the code
paths for pre-compiled, plain, and JSON indexes and by merging the
tuple_compare and tuple_compare_with_key setters.
---
src/box/tuple_compare.cc | 211 +++++++++++++++++++++++------------------------
1 file changed, 105 insertions(+), 106 deletions(-)
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index dfe30b19..19b396e2 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -1005,55 +1005,6 @@ static const comparator_signature cmp_arr[] = {
#undef COMPARATOR
-static const tuple_compare_t compare_slowpath_funcs[] = {
- tuple_compare_slowpath<false, false, false>,
- tuple_compare_slowpath<true, false, false>,
- tuple_compare_slowpath<false, true, false>,
- tuple_compare_slowpath<true, true, false>,
- tuple_compare_slowpath<false, false, true>,
- tuple_compare_slowpath<true, false, true>,
- tuple_compare_slowpath<false, true, true>,
- tuple_compare_slowpath<true, true, true>
-};
-
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
-{
- int cmp_func_idx = (def->is_nullable ? 1 : 0) +
- 2 * (def->has_optional_parts ? 1 : 0) +
- 4 * (def->has_json_paths ? 1 : 0);
- if (def->is_nullable) {
- if (key_def_is_sequential(def)) {
- if (def->has_optional_parts)
- return tuple_compare_sequential<true, true>;
- else
- return tuple_compare_sequential<true, false>;
- } else {
- return compare_slowpath_funcs[cmp_func_idx];
- }
- }
- assert(! def->has_optional_parts);
- if (!key_def_has_collation(def) && !def->has_json_paths) {
- /* Precalculated comparators don't use collation */
- for (uint32_t k = 0;
- k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
- uint32_t i = 0;
- for (; i < def->part_count; i++)
- if (def->parts[i].fieldno !=
- cmp_arr[k].p[i * 2] ||
- def->parts[i].type !=
- cmp_arr[k].p[i * 2 + 1])
- break;
- if (i == def->part_count &&
- cmp_arr[k].p[i * 2] == UINT32_MAX)
- return cmp_arr[k].f;
- }
- }
- return key_def_is_sequential(def) ?
- tuple_compare_sequential<false, false> :
- compare_slowpath_funcs[cmp_func_idx];
-}
-
/* }}} tuple_compare */
/* {{{ tuple_compare_with_key */
@@ -1233,66 +1184,114 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
#undef KEY_COMPARATOR
-static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
- tuple_compare_with_key_slowpath<false, false, false>,
- tuple_compare_with_key_slowpath<true, false, false>,
- tuple_compare_with_key_slowpath<false, true, false>,
- tuple_compare_with_key_slowpath<true, true, false>,
- tuple_compare_with_key_slowpath<false, false, true>,
- tuple_compare_with_key_slowpath<true, false, true>,
- tuple_compare_with_key_slowpath<false, true, true>,
- tuple_compare_with_key_slowpath<true, true, true>
-};
-
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
-{
- int cmp_func_idx = (def->is_nullable ? 1 : 0) +
- 2 * (def->has_optional_parts ? 1 : 0) +
- 4 * (def->has_json_paths ? 1 : 0);
- if (def->is_nullable) {
- if (key_def_is_sequential(def)) {
- if (def->has_optional_parts) {
- return tuple_compare_with_key_sequential<true,
- true>;
- } else {
- return tuple_compare_with_key_sequential<true,
- false>;
- }
- } else {
- return compare_with_key_slowpath_funcs[cmp_func_idx];
- }
- }
- assert(! def->has_optional_parts);
- if (!key_def_has_collation(def) && !def->has_json_paths) {
- /* Precalculated comparators don't use collation */
- for (uint32_t k = 0;
- k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
- k++) {
-
- uint32_t i = 0;
- for (; i < def->part_count; i++) {
- if (def->parts[i].fieldno !=
- cmp_wk_arr[k].p[i * 2] ||
- def->parts[i].type !=
- cmp_wk_arr[k].p[i * 2 + 1]) {
- break;
- }
- }
- if (i == def->part_count)
- return cmp_wk_arr[k].f;
- }
- }
- return key_def_is_sequential(def) ?
- tuple_compare_with_key_sequential<false, false> :
- compare_with_key_slowpath_funcs[cmp_func_idx];
-}
-
/* }}} tuple_compare_with_key */
+static void
+key_def_set_compare_func_fast(struct key_def *def)
+{
+ assert(!def->is_nullable);
+ assert(!def->has_optional_parts);
+ assert(!def->has_json_paths);
+ assert(!key_def_has_collation(def));
+
+ tuple_compare_t cmp = NULL;
+ tuple_compare_with_key_t cmp_wk = NULL;
+ bool is_sequential = key_def_is_sequential(def);
+
+ /*
+ * Use pre-compiled comparators if available, otherwise
+ * fall back on generic comparators.
+ */
+ for (uint32_t k = 0; k < lengthof(cmp_arr); k++) {
+ uint32_t i = 0;
+ for (; i < def->part_count; i++)
+ if (def->parts[i].fieldno != cmp_arr[k].p[i * 2] ||
+ def->parts[i].type != cmp_arr[k].p[i * 2 + 1])
+ break;
+ if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) {
+ cmp = cmp_arr[k].f;
+ break;
+ }
+ }
+ for (uint32_t k = 0; k < lengthof(cmp_wk_arr); k++) {
+ uint32_t i = 0;
+ for (; i < def->part_count; i++) {
+ if (def->parts[i].fieldno != cmp_wk_arr[k].p[i * 2] ||
+ def->parts[i].type != cmp_wk_arr[k].p[i * 2 + 1])
+ break;
+ }
+ if (i == def->part_count) {
+ cmp_wk = cmp_wk_arr[k].f;
+ break;
+ }
+ }
+ if (cmp == NULL) {
+ cmp = is_sequential ?
+ tuple_compare_sequential<false, false> :
+ tuple_compare_slowpath<false, false, false>;
+ }
+ if (cmp_wk == NULL) {
+ cmp_wk = is_sequential ?
+ tuple_compare_with_key_sequential<false, false> :
+ tuple_compare_with_key_slowpath<false, false, false>;
+ }
+
+ def->tuple_compare = cmp;
+ def->tuple_compare_with_key = cmp_wk;
+}
+
+template<bool is_nullable, bool has_optional_parts>
+static void
+key_def_set_compare_func_plain(struct key_def *def)
+{
+ assert(!def->has_json_paths);
+ if (key_def_is_sequential(def)) {
+ def->tuple_compare = tuple_compare_sequential
+ <is_nullable, has_optional_parts>;
+ def->tuple_compare_with_key = tuple_compare_with_key_sequential
+ <is_nullable, has_optional_parts>;
+ } else {
+ def->tuple_compare = tuple_compare_slowpath
+ <is_nullable, has_optional_parts, false>;
+ def->tuple_compare_with_key = tuple_compare_with_key_slowpath
+ <is_nullable, has_optional_parts, false>;
+ }
+}
+
+template<bool is_nullable, bool has_optional_parts>
+static void
+key_def_set_compare_func_json(struct key_def *def)
+{
+ assert(def->has_json_paths);
+ def->tuple_compare = tuple_compare_slowpath
+ <is_nullable, has_optional_parts, true>;
+ def->tuple_compare_with_key = tuple_compare_with_key_slowpath
+ <is_nullable, has_optional_parts, true>;
+}
+
void
key_def_set_compare_func(struct key_def *def)
{
- def->tuple_compare = tuple_compare_create(def);
- def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+ if (!key_def_has_collation(def) &&
+ !def->is_nullable && !def->has_json_paths) {
+ key_def_set_compare_func_fast(def);
+ } else if (!def->has_json_paths) {
+ if (def->is_nullable && def->has_optional_parts) {
+ key_def_set_compare_func_plain<true, true>(def);
+ } else if (def->is_nullable && !def->has_optional_parts) {
+ key_def_set_compare_func_plain<true, false>(def);
+ } else {
+ assert(!def->is_nullable && !def->has_optional_parts);
+ key_def_set_compare_func_plain<false, false>(def);
+ }
+ } else {
+ if (def->is_nullable && def->has_optional_parts) {
+ key_def_set_compare_func_json<true, true>(def);
+ } else if (def->is_nullable && !def->has_optional_parts) {
+ key_def_set_compare_func_json<true, false>(def);
+ } else {
+ assert(!def->is_nullable && !def->has_optional_parts);
+ key_def_set_compare_func_json<false, false>(def);
+ }
+ }
}
* Re: [tarantool-patches] Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
@ 2019-03-20 18:08 ` Vladimir Davydov
0 siblings, 0 replies; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-20 18:08 UTC (permalink / raw)
To: Kirill Shcherbatov; +Cc: tarantool-patches
Pushed to master with some changes. You will find the new patch below.
I removed the tests because I find them insufficient. Please cover all
the possible test cases and submit them in a separate patch.
---
From 9fba29abb4e05babb9b23b4413bd8083f0fba933 Mon Sep 17 00:00:00 2001
Message-Id: <9fba29abb4e05babb9b23b4413bd8083f0fba933.1553105230.git.vdavydov.dev@gmail.com>
From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Date: Wed, 13 Mar 2019 13:06:30 +0300
Subject: [PATCH] memtx: introduce tuple compare hint
Implement functions for retrieving tuple hints for a particular
key_def. Hint is an integer that can be used for tuple comparison
optimization: if a hint of one tuple is less than a hint of another
then the first tuple is definitely less than the second; only if the
hints are equal must tuple_compare be called to get the comparison
result. Hints are calculated using only the first part of key_def.
@locker:
- Rework key_def_set_hint_func.
- Refactor functions calculating hints.
- Drop has_collation template argument (we don't use templates
for collations anywhere else).
- Add part_count argument to key_hint (it's conventional to pass
part_count along with decoded key).
- Improve comments, rename a few functions, and cleanup code.
Close #3961
---
src/box/key_def.h | 115 ++++++++++
src/box/memtx_tree.c | 40 +++-
src/box/tuple_compare.cc | 535 +++++++++++++++++++++++++++++++++++++++++++++--
src/lib/coll/coll.c | 26 +++
src/lib/coll/coll.h | 12 ++
5 files changed, 702 insertions(+), 26 deletions(-)
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 7c9f8c6b..288cf727 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -117,6 +117,27 @@ struct key_def;
struct tuple;
/**
+ * Tuple comparison hint h(t) is such a function of tuple t that
+ * the following conditions always hold for any pair of tuples
+ * t1 and t2:
+ *
+ * if h(t1) < h(t2) then t1 < t2;
+ * if h(t1) > h(t2) then t1 > t2;
+ * if h(t1) == h(t2) then t1 may or may not be equal to t2.
+ *
+ * These rules mean that instead of direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their hints
+ * first and only if theirs hints equal compare the tuples
+ * themselves.
+ */
+typedef uint64_t hint_t;
+
+/**
+ * Reserved value to use when comparison hint is undefined.
+ */
+#define HINT_NONE ((hint_t)UINT64_MAX)
+
+/**
* Get is_nullable property of key_part.
* @param key_part for which attribute is being fetched
*
@@ -133,10 +154,23 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
const char *key,
uint32_t part_count,
struct key_def *key_def);
+/** @copydoc tuple_compare_with_key_hinted() */
+typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
+ hint_t tuple_hint,
+ const char *key,
+ uint32_t part_count,
+ hint_t key_hint,
+ struct key_def *key_def);
/** @copydoc tuple_compare() */
typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
const struct tuple *tuple_b,
struct key_def *key_def);
+/** @copydoc tuple_compare_hinted() */
+typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a,
+ hint_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ hint_t tuple_b_hint,
+ struct key_def *key_def);
/** @copydoc tuple_extract_key() */
typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
struct key_def *key_def,
@@ -152,13 +186,23 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
/** @copydoc key_hash() */
typedef uint32_t (*key_hash_t)(const char *key,
struct key_def *key_def);
+/** @copydoc tuple_hint() */
+typedef hint_t (*tuple_hint_t)(const struct tuple *tuple,
+ struct key_def *key_def);
+/** @copydoc key_hint() */
+typedef hint_t (*key_hint_t)(const char *key, uint32_t part_count,
+ struct key_def *key_def);
/* Definition of a multipart key. */
struct key_def {
/** @see tuple_compare() */
tuple_compare_t tuple_compare;
+ /** @see tuple_compare_hinted() */
+ tuple_compare_hinted_t tuple_compare_hinted;
/** @see tuple_compare_with_key() */
tuple_compare_with_key_t tuple_compare_with_key;
+ /** @see tuple_compare_with_key_hinted() */
+ tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
/** @see tuple_extract_key() */
tuple_extract_key_t tuple_extract_key;
/** @see tuple_extract_key_raw() */
@@ -167,6 +211,10 @@ struct key_def {
tuple_hash_t tuple_hash;
/** @see key_hash() */
key_hash_t key_hash;
+ /** @see tuple_hint() */
+ tuple_hint_t tuple_hint;
+ /** @see key_hint() */
+ key_hint_t key_hint;
/**
* Minimal part count which always is unique. For example,
* if a secondary index is unique, then
@@ -558,6 +606,26 @@ tuple_compare(const struct tuple *tuple_a, const struct tuple *tuple_b,
}
/**
+ * Compare tuples using the key definition and comparison hints.
+ * @param tuple_a first tuple
+ * @param tuple_a_hint comparison hint of @a tuple_a
+ * @param tuple_b second tuple
+ * @param tuple_b_hint comparison hint of @a tuple_b
+ * @param key_def key definition
+ * @retval 0 if key_fields(tuple_a) == key_fields(tuple_b)
+ * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b)
+ * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b)
+ */
+static inline int
+tuple_compare_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+ const struct tuple *tuple_b, hint_t tuple_b_hint,
+ struct key_def *key_def)
+{
+ return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
+ tuple_b_hint, key_def);
+}
+
+/**
* @brief Compare tuple with key using the key definition.
* @param tuple tuple
* @param key key parts without MessagePack array header
@@ -576,6 +644,29 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
}
/**
+ * Compare tuple with key using the key definition and
+ * comparison hints
+ * @param tuple tuple
+ * @param tuple_hint comparison hint of @a tuple
+ * @param key key parts without MessagePack array header
+ * @param part_count the number of parts in @a key
+ * @param key_hint comparison hint of @a key
+ * @param key_def key definition
+ * @retval 0 if key_fields(tuple) == parts(key)
+ * @retval <0 if key_fields(tuple) < parts(key)
+ * @retval >0 if key_fields(tuple) > parts(key)
+ */
+static inline int
+tuple_compare_with_key_hinted(const struct tuple *tuple, hint_t tuple_hint,
+ const char *key, uint32_t part_count,
+ hint_t key_hint, struct key_def *key_def)
+{
+ return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
+ part_count, key_hint,
+ key_def);
+}
+
+/**
* Compute hash of a tuple field.
* @param ph1 - pointer to running hash
* @param pcarry - pointer to carry
@@ -628,6 +719,30 @@ key_hash(const char *key, struct key_def *key_def)
return key_def->key_hash(key, key_def);
}
+ /*
+ * Get comparison hint for a tuple.
+ * @param tuple - tuple to compute the hint for
+ * @param key_def - key_def used for tuple comparison
+ * @return - hint value
+ */
+static inline hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+ return key_def->tuple_hint(tuple, key_def);
+}
+
+/**
+ * Get a comparison hint of a key.
+ * @param key - key to compute the hint for
+ * @param key_def - key_def used for tuple comparison
+ * @return - hint value
+ */
+static inline hint_t
+key_hint(const char *key, uint32_t part_count, struct key_def *key_def)
+{
+ return key_def->key_hint(key, part_count, key_def);
+}
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index e24c4032..fe037c54 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -47,6 +47,8 @@ struct memtx_tree_key_data {
const char *key;
/** Number of msgpacked search fields. */
uint32_t part_count;
+ /** Comparison hint, see tuple_hint(). */
+ hint_t hint;
};
/**
@@ -55,6 +57,8 @@ struct memtx_tree_key_data {
struct memtx_tree_data {
/* Tuple that this node is represents. */
struct tuple *tuple;
+ /** Comparison hint, see key_hint(). */
+ hint_t hint;
};
/**
@@ -69,16 +73,18 @@ static bool
memtx_tree_data_identical(const struct memtx_tree_data *a,
const struct memtx_tree_data *b)
{
- return a->tuple == b->tuple;
+ return a->tuple == b->tuple && a->hint == b->hint;
}
#define BPS_TREE_NAME memtx_tree
#define BPS_TREE_BLOCK_SIZE (512)
#define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
#define BPS_TREE_COMPARE(a, b, arg)\
- tuple_compare((&a)->tuple, (&b)->tuple, arg)
+ tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
+ (&b)->hint, arg)
#define BPS_TREE_COMPARE_KEY(a, b, arg)\
- tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg)
+ tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\
+ (b)->part_count, (b)->hint, arg)
#define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b)
#define bps_tree_elem_t struct memtx_tree_data
#define bps_tree_key_t struct memtx_tree_key_data *
@@ -119,7 +125,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c)
const struct memtx_tree_data *data_a = a;
const struct memtx_tree_data *data_b = b;
struct key_def *key_def = c;
- return tuple_compare(data_a->tuple, data_b->tuple, key_def);
+ return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple,
+ data_b->hint, key_def);
}
/* {{{ MemtxTree Iterators ****************************************/
@@ -241,9 +248,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret)
memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
/* Use user key def to save a few loops. */
if (res == NULL ||
- tuple_compare_with_key(res->tuple, it->key_data.key,
- it->key_data.part_count,
- it->index_def->key_def) != 0) {
+ tuple_compare_with_key_hinted(res->tuple, res->hint,
+ it->key_data.key,
+ it->key_data.part_count,
+ it->key_data.hint,
+ it->index_def->key_def) != 0) {
iterator->next = tree_iterator_dummie;
it->current.tuple = NULL;
*ret = NULL;
@@ -272,9 +281,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
/* Use user key def to save a few loops. */
if (res == NULL ||
- tuple_compare_with_key(res->tuple, it->key_data.key,
- it->key_data.part_count,
- it->index_def->key_def) != 0) {
+ tuple_compare_with_key_hinted(res->tuple, res->hint,
+ it->key_data.key,
+ it->key_data.part_count,
+ it->key_data.hint,
+ it->index_def->key_def) != 0) {
iterator->next = tree_iterator_dummie;
it->current.tuple = NULL;
*ret = NULL;
@@ -513,9 +524,11 @@ memtx_tree_index_get(struct index *base, const char *key,
assert(base->def->opts.is_unique &&
part_count == base->def->key_def->part_count);
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
struct memtx_tree_key_data key_data;
key_data.key = key;
key_data.part_count = part_count;
+ key_data.hint = key_hint(key, part_count, cmp_def);
struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
*result = res != NULL ? res->tuple : NULL;
return 0;
@@ -527,9 +540,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
struct tuple **result)
{
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
if (new_tuple) {
struct memtx_tree_data new_data;
new_data.tuple = new_tuple;
+ new_data.hint = tuple_hint(new_tuple, cmp_def);
struct memtx_tree_data dup_data;
dup_data.tuple = NULL;
@@ -562,6 +577,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
if (old_tuple) {
struct memtx_tree_data old_data;
old_data.tuple = old_tuple;
+ old_data.hint = tuple_hint(old_tuple, cmp_def);
memtx_tree_delete(&index->tree, old_data);
}
*result = old_tuple;
@@ -574,6 +590,7 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
{
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
struct memtx_engine *memtx = (struct memtx_engine *)base->engine;
+ struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
assert(part_count == 0 || key != NULL);
if (type > ITER_GT) {
@@ -604,6 +621,7 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
it->type = type;
it->key_data.key = key;
it->key_data.part_count = part_count;
+ it->key_data.hint = key_hint(key, part_count, cmp_def);
it->index_def = base->def;
it->tree = &index->tree;
it->tree_iterator = memtx_tree_invalid_iterator();
@@ -641,6 +659,7 @@ static int
memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
{
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+ struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
if (index->build_array == NULL) {
index->build_array = malloc(MEMTX_EXTENT_SIZE);
if (index->build_array == NULL) {
@@ -668,6 +687,7 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
struct memtx_tree_data *elem =
&index->build_array[index->build_array_size++];
elem->tuple = tuple;
+ elem->hint = tuple_hint(tuple, cmp_def);
return 0;
}
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 595f6f25..93756365 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -36,6 +36,25 @@
/* {{{ tuple_compare */
+/**
+ * Compare two tuple hints.
+ *
+ * Returns:
+ *
+ * -1 if the first tuple is less than the second tuple
+ * +1 if the first tuple is greater than the second tuple
+ * 0 if the first tuple may be less than, equal to, or
+ * greater than the second tuple, and a full tuple
+ * comparison is needed to determine the order.
+ */
+static inline int
+hint_cmp(hint_t hint_a, hint_t hint_b)
+{
+ if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b)
+ return hint_a < hint_b ? -1 : 1;
+ return 0;
+}
+
/*
* Compare two tuple fields.
* Separate version exists since compare is a very
@@ -53,7 +72,8 @@ enum mp_class {
MP_CLASS_STR,
MP_CLASS_BIN,
MP_CLASS_ARRAY,
- MP_CLASS_MAP
+ MP_CLASS_MAP,
+ mp_class_max,
};
static enum mp_class mp_classes[] = {
@@ -427,13 +447,17 @@ tuple_compare_field_with_type(const char *field_a, enum mp_type a_type,
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static inline int
-tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
- struct key_def *key_def)
+tuple_compare_slowpath_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+ const struct tuple *tuple_b, hint_t tuple_b_hint,
+ struct key_def *key_def)
{
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
struct key_part *part = key_def->parts;
const char *tuple_a_raw = tuple_data(tuple_a);
const char *tuple_b_raw = tuple_data(tuple_b);
@@ -469,7 +493,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
struct key_part *end;
const char *field_a, *field_b;
enum mp_type a_type, b_type;
- int rc;
if (is_nullable)
end = part + key_def->unique_part_count;
else
@@ -559,8 +582,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
static inline int
-tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
- uint32_t part_count, struct key_def *key_def)
+tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
+ struct key_def *key_def)
+{
+ return tuple_compare_slowpath_hinted
+ <is_nullable, has_optional_parts, has_json_paths>
+ (tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def);
+}
+
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
+ hint_t tuple_hint, const char *key, uint32_t part_count,
+ hint_t key_hint, struct key_def *key_def)
{
assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
@@ -568,6 +602,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
assert(has_optional_parts == key_def->has_optional_parts);
assert(key != NULL || part_count == 0);
assert(part_count <= key_def->part_count);
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
struct key_part *part = key_def->parts;
struct tuple_format *format = tuple_format(tuple);
const char *tuple_raw = tuple_data(tuple);
@@ -603,7 +640,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
}
struct key_part *end = part + part_count;
- int rc;
for (; part < end; ++part, mp_next(&key)) {
const char *field;
if (has_json_paths) {
@@ -642,6 +678,16 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
return 0;
}
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
+ uint32_t part_count, struct key_def *key_def)
+{
+ return tuple_compare_with_key_slowpath_hinted
+ <is_nullable, has_optional_parts, has_json_paths>
+ (tuple, HINT_NONE, key, part_count, HINT_NONE, key_def);
+}
+
template<bool is_nullable>
static inline int
key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
@@ -700,13 +746,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
template<bool is_nullable, bool has_optional_parts>
static inline int
-tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
- uint32_t part_count, struct key_def *key_def)
+tuple_compare_with_key_sequential_hinted(const struct tuple *tuple,
+ hint_t tuple_hint, const char *key, uint32_t part_count,
+ hint_t key_hint, struct key_def *key_def)
{
assert(!has_optional_parts || is_nullable);
assert(key_def_is_sequential(key_def));
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
const char *tuple_key = tuple_data(tuple);
uint32_t field_count = mp_decode_array(&tuple_key);
uint32_t cmp_part_count;
@@ -716,8 +766,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
assert(field_count >= part_count);
cmp_part_count = part_count;
}
- int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
- key_def);
+ rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
+ key_def);
if (!has_optional_parts || rc != 0)
return rc;
/*
@@ -739,6 +789,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
return 0;
}
+template<bool is_nullable, bool has_optional_parts>
+static inline int
+tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
+ uint32_t part_count, struct key_def *key_def)
+{
+ return tuple_compare_with_key_sequential_hinted
+ <is_nullable, has_optional_parts>
+ (tuple, HINT_NONE, key, part_count, HINT_NONE, key_def);
+}
+
int
key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
{
@@ -759,13 +819,17 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
template <bool is_nullable, bool has_optional_parts>
static int
-tuple_compare_sequential(const struct tuple *tuple_a,
- const struct tuple *tuple_b, key_def *key_def)
+tuple_compare_sequential_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+ const struct tuple *tuple_b, hint_t tuple_b_hint,
+ struct key_def *key_def)
{
assert(!has_optional_parts || is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
assert(key_def_is_sequential(key_def));
assert(is_nullable == key_def->is_nullable);
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
const char *key_a = tuple_data(tuple_a);
uint32_t fc_a = mp_decode_array(&key_a);
const char *key_b = tuple_data(tuple_b);
@@ -779,7 +843,6 @@ tuple_compare_sequential(const struct tuple *tuple_a,
bool was_null_met = false;
struct key_part *part = key_def->parts;
struct key_part *end = part + key_def->unique_part_count;
- int rc;
uint32_t i = 0;
for (; part < end; ++part, ++i) {
enum mp_type a_type, b_type;
@@ -826,6 +889,16 @@ tuple_compare_sequential(const struct tuple *tuple_a,
return 0;
}
+template <bool is_nullable, bool has_optional_parts>
+static int
+tuple_compare_sequential(const struct tuple *tuple_a,
+ const struct tuple *tuple_b, struct key_def *key_def)
+{
+ return tuple_compare_sequential_hinted
+ <is_nullable, has_optional_parts>
+ (tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def);
+}
+
template <int TYPE>
static inline int
field_compare(const char **field_a, const char **field_b);
@@ -956,6 +1029,18 @@ struct TupleCompare
compare(tuple_a, tuple_b, format_a,
format_b, field_a, field_b);
}
+
+ static int compare_hinted(const struct tuple *tuple_a,
+ hint_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ hint_t tuple_b_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple_a, tuple_b, key_def);
+ }
};
template <int TYPE, int ...MORE_TYPES>
@@ -973,15 +1058,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> {
return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b,
format_a, format_b, field_a, field_b);
}
+
+ static int compare_hinted(const struct tuple *tuple_a,
+ hint_t tuple_a_hint,
+ const struct tuple *tuple_b,
+ hint_t tuple_b_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple_a, tuple_b, key_def);
+ }
};
} /* end of anonymous namespace */
struct comparator_signature {
tuple_compare_t f;
+ tuple_compare_hinted_t f_hinted;
uint32_t p[64];
};
#define COMPARATOR(...) \
- { TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } },
+ { TupleCompare<__VA_ARGS__>::compare, \
+ TupleCompare<__VA_ARGS__>::compare_hinted, \
+ { __VA_ARGS__, UINT32_MAX } },
/**
* field1 no, field1 type, field2 no, field2 type, ...
@@ -1133,6 +1233,17 @@ struct TupleCompareWithKey
compare(tuple, key, part_count,
key_def, format, field);
}
+
+ static int
+ compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+ const char *key, uint32_t part_count, hint_t key_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple, key, part_count, key_def);
+ }
};
template <int TYPE, int ...MORE_TYPES>
@@ -1153,6 +1264,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
compare(tuple, key, part_count,
key_def, format, field);
}
+
+ static int
+ compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+ const char *key, uint32_t part_count, hint_t key_hint,
+ struct key_def *key_def)
+ {
+ int rc = hint_cmp(tuple_hint, key_hint);
+ if (rc != 0)
+ return rc;
+ return compare(tuple, key, part_count, key_def);
+ }
};
} /* end of anonymous namespace */
@@ -1160,11 +1282,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
struct comparator_with_key_signature
{
tuple_compare_with_key_t f;
+ tuple_compare_with_key_hinted_t f_hinted;
uint32_t p[64];
};
#define KEY_COMPARATOR(...) \
- { TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
+ { TupleCompareWithKey<0, __VA_ARGS__>::compare, \
+ TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \
+ { __VA_ARGS__ } },
static const comparator_with_key_signature cmp_wk_arr[] = {
KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
@@ -1186,6 +1311,356 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
/* }}} tuple_compare_with_key */
+/* {{{ tuple_hint */
+
+/**
+ * A comparison hint is an unsigned integer number that has
+ * the following layout:
+ *
+ * [ class | value ]
+ * <-- HINT_CLASS_BITS --> <-- HINT_VALUE_BITS -->
+ * <----------------- HINT_BITS ----------------->
+ *
+ * For simplicity we construct it using the first key part only;
+ * other key parts don't participate in hint construction. As a
+ * consequence, tuple hints are useless if the first key part
+ * doesn't differ among indexed tuples.
+ *
+ * Hint class stores one of mp_class enum values corresponding
+ * to the field type. We store it in upper bits of a hint so
+ * as to guarantee that tuple fields that have different types
+ * are compared correctly in a scalar index. We use the same
+ * layout for indexes of base types too so that an index can
+ * be altered from scalar to any of the base types, or vice
+ * versa, without recalculating hints or rebuilding the index.
+ *
+ * Hint value is stored in lower bits of a hint and chosen so
+ * as to satisfy the hint property (see the comment to hint_t).
+ * Also, hint values must be compatible among all kinds of numbers
+ * (integer, unsigned, floating point) so that we can alter an
+ * index between any two of them without touching the hints.
+ *
+ * The value depends on the field type:
+ *
+ * - For an integer field, the hint value equals the number
+ * stored in the field minus the minimal (negative) integer
+ * number that fits in a hint. If the number is too large
+ * or too small to fit in a hint, we use the max or the min
+ * number that fits. This guarantees correct order for all
+ * positive and negative integers.
+ *
+ * - For a field storing a floating point number, we use
+ * the hint computed for the integral part. This ensures
+ * compatibility between floating point and integer field
+ * hints.
+ *
+ * - For a boolean field, the hint value is simply 0 or 1.
+ *
+ * - For a string field, we take the first few characters that
+ * fit in a hint and shift them in such a way that comparing
+ * them is equivalent to strcmp(). If there's a collation,
+ * we use the sort key instead of the string (see coll->hint).
+ *
+ * - For a field containing NULL, the value is 0, and we rely on
+ * mp_class comparison rules for arranging nullable fields.
+ */
+#define HINT_BITS (sizeof(hint_t) * CHAR_BIT)
+#define HINT_CLASS_BITS 4
+#define HINT_VALUE_BITS (HINT_BITS - HINT_CLASS_BITS)
+
+/** Number of bytes that fit in a hint value. */
+#define HINT_VALUE_BYTES (HINT_VALUE_BITS / CHAR_BIT)
+
+/** Max unsigned integer that can be stored in a hint value. */
+#define HINT_VALUE_MAX ((1ULL << HINT_VALUE_BITS) - 1)
+
+/**
+ * Max and min signed integer numbers that fit in a hint value.
+ * For numbers > MAX and < MIN we store MAX and MIN, respectively.
+ */
+#define HINT_VALUE_INT_MAX ((1LL << (HINT_VALUE_BITS - 1)) - 1)
+#define HINT_VALUE_INT_MIN (-(1LL << (HINT_VALUE_BITS - 1)))
+
+/**
+ * Max and min floating point numbers whose integral parts fit
+ * in a hint value. Note, we can't compare a floating point number
+ * with HINT_VALUE_INT_{MIN,MAX} because of rounding errors.
+ */
+#define HINT_VALUE_DOUBLE_MAX (exp2(HINT_VALUE_BITS - 1) - 1)
+#define HINT_VALUE_DOUBLE_MIN (-exp2(HINT_VALUE_BITS - 1))
+
+/*
+ * HINT_CLASS_BITS should be big enough to store any mp_class value.
+ * Note, ((1 << HINT_CLASS_BITS) - 1) is reserved for HINT_NONE.
+ */
+static_assert(mp_class_max < (1 << HINT_CLASS_BITS) - 1,
+ "mp_class must fit in tuple hint");
+
+static inline hint_t
+hint_create(enum mp_class c, uint64_t val)
+{
+ assert((val >> HINT_VALUE_BITS) == 0);
+ return (hint_t)(((uint64_t)c << HINT_VALUE_BITS) | val);
+}
+
+static inline hint_t
+hint_nil(void)
+{
+ return hint_create(MP_CLASS_NIL, 0);
+}
+
+static inline hint_t
+hint_bool(bool b)
+{
+ return hint_create(MP_CLASS_BOOL, b ? 1 : 0);
+}
+
+static inline hint_t
+hint_uint(uint64_t u)
+{
+ uint64_t val = (u >= (uint64_t)HINT_VALUE_INT_MAX ?
+ HINT_VALUE_MAX : u - HINT_VALUE_INT_MIN);
+ return hint_create(MP_CLASS_NUMBER, val);
+}
+
+static inline hint_t
+hint_int(int64_t i)
+{
+ assert(i < 0);
+ uint64_t val = (i <= HINT_VALUE_INT_MIN ? 0 : i - HINT_VALUE_INT_MIN);
+ return hint_create(MP_CLASS_NUMBER, val);
+}
+
+static inline hint_t
+hint_double(double d)
+{
+ if (!isfinite(d))
+ return HINT_NONE;
+
+ uint64_t val;
+ if (d >= HINT_VALUE_DOUBLE_MAX)
+ val = HINT_VALUE_MAX;
+ else if (d <= HINT_VALUE_DOUBLE_MIN)
+ val = 0;
+ else
+ val = (int64_t)d - HINT_VALUE_INT_MIN;
+
+ return hint_create(MP_CLASS_NUMBER, val);
+}
+
+static inline uint64_t
+hint_str_raw(const char *s, uint32_t len)
+{
+ len = MIN(len, HINT_VALUE_BYTES);
+ uint64_t val = 0;
+ for (uint32_t i = 0; i < len; i++) {
+ val <<= CHAR_BIT;
+ val |= (unsigned char)s[i];
+ }
+ val <<= CHAR_BIT * (HINT_VALUE_BYTES - len);
+ return val;
+}
+
+static inline hint_t
+hint_str(const char *s, uint32_t len)
+{
+ uint64_t val = hint_str_raw(s, len);
+ return hint_create(MP_CLASS_STR, val);
+}
+
+static inline hint_t
+hint_str_coll(const char *s, uint32_t len, struct coll *coll)
+{
+ char buf[HINT_VALUE_BYTES];
+ uint32_t buf_len = coll->hint(s, len, buf, sizeof(buf), coll);
+ uint64_t val = hint_str_raw(buf, buf_len);
+ return hint_create(MP_CLASS_STR, val);
+}
+
+static inline hint_t
+hint_bin(const char *s, uint32_t len)
+{
+ uint64_t val = hint_str_raw(s, len);
+ return hint_create(MP_CLASS_BIN, val);
+}
+
+static inline hint_t
+field_hint_boolean(const char *field)
+{
+ assert(mp_typeof(*field) == MP_BOOL);
+ return hint_bool(mp_decode_bool(&field));
+}
+
+static inline hint_t
+field_hint_unsigned(const char *field)
+{
+ assert(mp_typeof(*field) == MP_UINT);
+ return hint_uint(mp_decode_uint(&field));
+}
+
+static inline hint_t
+field_hint_integer(const char *field)
+{
+ switch (mp_typeof(*field)) {
+ case MP_UINT:
+ return hint_uint(mp_decode_uint(&field));
+ case MP_INT:
+ return hint_int(mp_decode_int(&field));
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+static inline hint_t
+field_hint_number(const char *field)
+{
+ switch (mp_typeof(*field)) {
+ case MP_UINT:
+ return hint_uint(mp_decode_uint(&field));
+ case MP_INT:
+ return hint_int(mp_decode_int(&field));
+ case MP_FLOAT:
+ return hint_double(mp_decode_float(&field));
+ case MP_DOUBLE:
+ return hint_double(mp_decode_double(&field));
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+static inline hint_t
+field_hint_string(const char *field, struct coll *coll)
+{
+ assert(mp_typeof(*field) == MP_STR);
+ uint32_t len = mp_decode_strl(&field);
+ return coll == NULL ? hint_str(field, len) :
+ hint_str_coll(field, len, coll);
+}
+
+static inline hint_t
+field_hint_scalar(const char *field, struct coll *coll)
+{
+ uint32_t len;
+ switch (mp_typeof(*field)) {
+ case MP_BOOL:
+ return hint_bool(mp_decode_bool(&field));
+ case MP_UINT:
+ return hint_uint(mp_decode_uint(&field));
+ case MP_INT:
+ return hint_int(mp_decode_int(&field));
+ case MP_FLOAT:
+ return hint_double(mp_decode_float(&field));
+ case MP_DOUBLE:
+ return hint_double(mp_decode_double(&field));
+ case MP_STR:
+ len = mp_decode_strl(&field);
+ return coll == NULL ? hint_str(field, len) :
+ hint_str_coll(field, len, coll);
+ case MP_BIN:
+ len = mp_decode_binl(&field);
+ return hint_bin(field, len);
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable>
+static inline hint_t
+field_hint(const char *field, struct coll *coll)
+{
+ if (is_nullable && mp_typeof(*field) == MP_NIL)
+ return hint_nil();
+ switch (type) {
+ case FIELD_TYPE_BOOLEAN:
+ return field_hint_boolean(field);
+ case FIELD_TYPE_UNSIGNED:
+ return field_hint_unsigned(field);
+ case FIELD_TYPE_INTEGER:
+ return field_hint_integer(field);
+ case FIELD_TYPE_NUMBER:
+ return field_hint_number(field);
+ case FIELD_TYPE_STRING:
+ return field_hint_string(field, coll);
+ case FIELD_TYPE_SCALAR:
+ return field_hint_scalar(field, coll);
+ default:
+ unreachable();
+ }
+ return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable>
+static hint_t
+key_hint(const char *key, uint32_t part_count, struct key_def *key_def)
+{
+ if (part_count == 0)
+ return HINT_NONE;
+ return field_hint<type, is_nullable>(key, key_def->parts->coll);
+}
+
+template <enum field_type type, bool is_nullable>
+static hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+ const char *field = tuple_field_by_part(tuple, key_def->parts);
+ if (is_nullable && field == NULL)
+ return hint_nil();
+ return field_hint<type, is_nullable>(field, key_def->parts->coll);
+}
+
+template<enum field_type type, bool is_nullable>
+static void
+key_def_set_hint_func(struct key_def *def)
+{
+ def->key_hint = key_hint<type, is_nullable>;
+ def->tuple_hint = tuple_hint<type, is_nullable>;
+}
+
+template<enum field_type type>
+static void
+key_def_set_hint_func(struct key_def *def)
+{
+ if (key_part_is_nullable(def->parts))
+ key_def_set_hint_func<type, true>(def);
+ else
+ key_def_set_hint_func<type, false>(def);
+}
+
+static void
+key_def_set_hint_func(struct key_def *def)
+{
+ switch (def->parts->type) {
+ case FIELD_TYPE_BOOLEAN:
+ key_def_set_hint_func<FIELD_TYPE_BOOLEAN>(def);
+ break;
+ case FIELD_TYPE_UNSIGNED:
+ key_def_set_hint_func<FIELD_TYPE_UNSIGNED>(def);
+ break;
+ case FIELD_TYPE_INTEGER:
+ key_def_set_hint_func<FIELD_TYPE_INTEGER>(def);
+ break;
+ case FIELD_TYPE_NUMBER:
+ key_def_set_hint_func<FIELD_TYPE_NUMBER>(def);
+ break;
+ case FIELD_TYPE_STRING:
+ key_def_set_hint_func<FIELD_TYPE_STRING>(def);
+ break;
+ case FIELD_TYPE_SCALAR:
+ key_def_set_hint_func<FIELD_TYPE_SCALAR>(def);
+ break;
+ default:
+ /* Invalid key definition. */
+ def->key_hint = NULL;
+ def->tuple_hint = NULL;
+ break;
+ }
+}
+
+/* }}} tuple_hint */
+
static void
key_def_set_compare_func_fast(struct key_def *def)
{
@@ -1195,7 +1670,9 @@ key_def_set_compare_func_fast(struct key_def *def)
assert(!key_def_has_collation(def));
tuple_compare_t cmp = NULL;
+ tuple_compare_hinted_t cmp_hinted = NULL;
tuple_compare_with_key_t cmp_wk = NULL;
+ tuple_compare_with_key_hinted_t cmp_wk_hinted = NULL;
bool is_sequential = key_def_is_sequential(def);
/*
@@ -1210,6 +1687,7 @@ key_def_set_compare_func_fast(struct key_def *def)
break;
if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) {
cmp = cmp_arr[k].f;
+ cmp_hinted = cmp_arr[k].f_hinted;
break;
}
}
@@ -1222,6 +1700,7 @@ key_def_set_compare_func_fast(struct key_def *def)
}
if (i == def->part_count) {
cmp_wk = cmp_wk_arr[k].f;
+ cmp_wk_hinted = cmp_wk_arr[k].f_hinted;
break;
}
}
@@ -1229,15 +1708,23 @@ key_def_set_compare_func_fast(struct key_def *def)
cmp = is_sequential ?
tuple_compare_sequential<false, false> :
tuple_compare_slowpath<false, false, false>;
+ cmp_hinted = is_sequential ?
+ tuple_compare_sequential_hinted<false, false> :
+ tuple_compare_slowpath_hinted<false, false, false>;
}
if (cmp_wk == NULL) {
cmp_wk = is_sequential ?
tuple_compare_with_key_sequential<false, false> :
tuple_compare_with_key_slowpath<false, false, false>;
+ cmp_wk_hinted = is_sequential ?
+ tuple_compare_with_key_sequential_hinted<false, false> :
+ tuple_compare_with_key_slowpath_hinted<false, false, false>;
}
def->tuple_compare = cmp;
+ def->tuple_compare_hinted = cmp_hinted;
def->tuple_compare_with_key = cmp_wk;
+ def->tuple_compare_with_key_hinted = cmp_wk_hinted;
}
template<bool is_nullable, bool has_optional_parts>
@@ -1248,13 +1735,23 @@ key_def_set_compare_func_plain(struct key_def *def)
if (key_def_is_sequential(def)) {
def->tuple_compare = tuple_compare_sequential
<is_nullable, has_optional_parts>;
+ def->tuple_compare_hinted = tuple_compare_sequential_hinted
+ <is_nullable, has_optional_parts>;
def->tuple_compare_with_key = tuple_compare_with_key_sequential
<is_nullable, has_optional_parts>;
+ def->tuple_compare_with_key_hinted =
+ tuple_compare_with_key_sequential_hinted
+ <is_nullable, has_optional_parts>;
} else {
def->tuple_compare = tuple_compare_slowpath
<is_nullable, has_optional_parts, false>;
+ def->tuple_compare_hinted = tuple_compare_slowpath_hinted
+ <is_nullable, has_optional_parts, false>;
def->tuple_compare_with_key = tuple_compare_with_key_slowpath
<is_nullable, has_optional_parts, false>;
+ def->tuple_compare_with_key_hinted =
+ tuple_compare_with_key_slowpath_hinted
+ <is_nullable, has_optional_parts, false>;
}
}
@@ -1265,8 +1762,13 @@ key_def_set_compare_func_json(struct key_def *def)
assert(def->has_json_paths);
def->tuple_compare = tuple_compare_slowpath
<is_nullable, has_optional_parts, true>;
+ def->tuple_compare_hinted = tuple_compare_slowpath_hinted
+ <is_nullable, has_optional_parts, true>;
def->tuple_compare_with_key = tuple_compare_with_key_slowpath
<is_nullable, has_optional_parts, true>;
+ def->tuple_compare_with_key_hinted =
+ tuple_compare_with_key_slowpath_hinted
+ <is_nullable, has_optional_parts, true>;
}
void
@@ -1294,4 +1796,5 @@ key_def_set_compare_func(struct key_def *def)
key_def_set_compare_func_json<false, false>(def);
}
}
+ key_def_set_hint_func(def);
}
diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
index afc15e80..b83f0fdc 100644
--- a/src/lib/coll/coll.c
+++ b/src/lib/coll/coll.c
@@ -133,6 +133,30 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
return s_len;
}
+static size_t
+coll_icu_hint(const char *s, size_t s_len, char *buf, size_t buf_len,
+ struct coll *coll)
+{
+ assert(coll->type == COLL_TYPE_ICU);
+ UCharIterator itr;
+ uiter_setUTF8(&itr, s, s_len);
+ uint32_t state[2] = {0, 0};
+ UErrorCode status = U_ZERO_ERROR;
+ return ucol_nextSortKeyPart(coll->collator, &itr, state,
+ (uint8_t *)buf, buf_len, &status);
+}
+
+static size_t
+coll_bin_hint(const char *s, size_t s_len, char *buf, size_t buf_len,
+ struct coll *coll)
+{
+ (void)coll;
+ assert(coll->type == COLL_TYPE_BINARY);
+ size_t len = MIN(s_len, buf_len);
+ memcpy(buf, s, len);
+ return len;
+}
+
/**
* Set up ICU collator and init cmp and hash members of collation.
* @param coll Collation to set up.
@@ -242,6 +266,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
}
coll->cmp = coll_icu_cmp;
coll->hash = coll_icu_hash;
+ coll->hint = coll_icu_hint;
return 0;
}
@@ -356,6 +381,7 @@ coll_new(const struct coll_def *def)
coll->collator = NULL;
coll->cmp = coll_bin_cmp;
coll->hash = coll_bin_hash;
+ coll->hint = coll_bin_hint;
break;
default:
unreachable();
diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h
index 8c9f9429..c505804f 100644
--- a/src/lib/coll/coll.h
+++ b/src/lib/coll/coll.h
@@ -48,6 +48,9 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
uint32_t *pcarry, struct coll *coll);
+typedef size_t (*coll_hint_f)(const char *s, size_t s_len, char *buf,
+ size_t buf_len, struct coll *coll);
+
struct UCollator;
/**
@@ -61,7 +64,16 @@ struct coll {
struct UCollator *collator;
/** String comparator. */
coll_cmp_f cmp;
+ /** String hash function. */
coll_hash_f hash;
+ /**
+ * String comparison hint.
+ *
+ * This function copies a sort key for a string to
+ * the given buffer and returns the number of bytes
+ * copied. Sort keys may be compared using strcmp().
+ */
+ coll_hint_f hint;
/** Reference counter. */
int refs;
/**
Thread overview: 11+ messages
2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
2019-03-14 7:04 ` Vladimir Davydov
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
2019-03-15 10:55 ` Kirill Shcherbatov
2019-03-19 19:38 ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
2019-03-14 8:19 ` Vladimir Davydov
2019-03-15 10:20 ` [tarantool-patches] " Kirill Shcherbatov
2019-03-20 18:08 ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov