Tarantool development patches archive
* [PATCH v6 0/3] box: introduce hint option for memtx tree index
@ 2019-03-13 12:15 Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

Reworked the memtx tree to use 'tuple hints'.
Introduced special functions for retrieving tuple hints for a particular
key_def. A hint is an integer that can be used for tuple comparison
optimization: if the hint of one tuple is less than the hint of another,
then the first tuple is definitely less than the second; tuple_compare
must be called to get the comparison result only if the hints are equal.

Hints are only useful when:

* they are precalculated and stored along with the tuple;
  the calculation is not cheap (though cheaper than tuple_compare), but
  precalculated hints allow comparing tuples without even fetching
  the tuple data;
* the first part of the key_def is 'string', 'unsigned' or 'integer';
* since the hint is calculated using only the first part of the key_def
  (and only the first several characters if it is a string), this part
  must be different with high probability for every tuple pair.
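The hint-first comparison pattern described above can be sketched as follows (the struct and names are illustrative, not Tarantool's actual API):

```c
#include <stdint.h>
#include <string.h>

/* Illustrative tuple with a precalculated hint (not Tarantool's API). */
struct hinted_tuple {
	uint64_t hint;   /* precalculated from the first key part */
	const char *key; /* full key data, fetched only on a hint tie */
};

/* Compare by hint first; fall back to the expensive full comparison
 * only when the hints are equal. */
static int
hinted_cmp(const struct hinted_tuple *a, const struct hinted_tuple *b)
{
	if (a->hint != b->hint)
		return a->hint < b->hint ? -1 : 1;
	return strcmp(a->key, b->key);
}
```

When the first key parts rarely collide, most comparisons resolve on the integer test alone, without touching tuple memory.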

The enabled hint option improves performance by 15% on average; select
operations are significantly accelerated (there are scenarios in which
the difference reaches 200-250%).

Also appended a multikey index prototype. I am going to try to rework
the field_map initialization and resend the last letter a bit later.

Changes in version 6:
  -- Changed the hint format: the two most significant bits are now
     reserved for the type
  -- Hints for the SCALAR type
  -- All compare routines now have two versions: hinted and non-hinted
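The v6 layout can be sketched with the packing macros below (these mirror the HINT_TYPE/HINT_VALUE/HINT definitions from patch 2/3; the type constants used in the test are illustrative):

```c
#include <stdint.h>

/* The two most significant bits of the 64-bit hint carry a type tag,
 * the low 62 bits carry the comparable value. */
#define HINT_TYPE(hint)   ((uint64_t)(hint) >> 62)
#define HINT_VALUE(hint)  ((uint64_t)(hint) & (((uint64_t)1 << 62) - 1))
#define HINT(type, value) (((uint64_t)(type) << 62) | (uint64_t)(value))
```

Because the type tag occupies the high bits, hints of the same type order by value, and a smaller type tag always orders before a larger one.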

Changes in version 5:
  -- Code rewritten without classes and macro definitions, using vtabs.
  -- Appended multikey index prototype.

Changes in version 4:
  -- Code rewritten in C++ with classes. This provides better
     maintainability in the future.
  -- New hints for number and boolean types. Memtx Tree is always hinted now.
  -- INVALID_HINT marker. We need it because doubles have special values
     like NaN that mustn't become a problem for the hints business.
  -- Rebased the patch after part of the code was merged.

Changes in version 3:
  -- Better-structured code
  -- Refactored all memtx indexes to use shared mempool of default size
  -- Refactored all memtx indexes to hide implementation details from headers
  -- Moved all hints-related code to corresponding module
  -- Better types names, better comments
  -- Fixed a potential bug with iterators: introduced the
     MEMTX_TREE_IDENTICAL macro
  -- Fixed inaccurate MEMTX_TREE_ELEM_SET usage in memtx_tree_index_build_next
  -- Changed the approach to calculating string hints
  -- Introduced a separate hint for the binary collation type

Changes in version 2:
  -- Split the patch into parts in a different way to decrease the diff
  -- Hints are no longer an index option, but the default where possible
  -- Removed hints for numeric types

v5: https://www.freelists.org/post/tarantool-patches/PATCH-v5-04-box-introduce-hint-option-for-memtx-tree-index
v4: https://www.freelists.org/post/tarantool-patches/PATCH-v4-07-box-introduce-hint-option-for-memtx-tree-index
v3: https://www.freelists.org/post/tarantool-patches/PATCH-v3-07-box-introduce-hint-option-for-memtx-tree-index
v2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-04-box-introduce-hint-option-for-memtx-tree-index
v1: https://www.freelists.org/post/tarantool-patches/PATCH-v1-04-box-introduce-hint-option-for-memtx-tree-index

http://github.com/tarantool/tarantool/tree/kshch/gh-3961-tuple-hints
https://github.com/tarantool/tarantool/issues/3961

Kirill Shcherbatov (3):
  box: refactor key_def_set_compare_func routine
  memtx: introduce tuple compare hint
  box: introduce multikey indexes

 src/box/key_def.c         |  15 +
 src/box/key_def.h         | 100 ++++++
 src/box/memtx_tree.c      | 217 +++++++++++-
 src/box/tuple.c           |   8 +-
 src/box/tuple.h           | 122 ++++++-
 src/box/tuple_compare.cc  | 705 +++++++++++++++++++++++++++++++++++---
 src/box/tuple_format.c    | 120 +++++--
 src/lib/coll/coll.c       |  33 ++
 src/lib/coll/coll.h       |   4 +
 test/engine/json.result   |  80 ++++-
 test/engine/json.test.lua |  20 ++
 11 files changed, 1330 insertions(+), 94 deletions(-)

-- 
2.21.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
  2019-03-14  7:04   ` Vladimir Davydov
  2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov
  2 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

Refactored the tuple_compare_create and tuple_compare_with_key_create
routines to set the corresponding key_def field on their own instead
of returning a pointer to the function. This is required because in
further patches we will set two key_def pointers: one for the hinted
and one for the non-hinted routine version.
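A minimal sketch of the pattern change with a stripped-down key_def (illustrative, not the real struct): a factory can return only one function pointer, while a setter can install several related fields at once, which the hinted/non-hinted pair will require:

```c
#include <stddef.h>

/* Stripped-down key_def (illustrative). */
struct key_def {
	int (*tuple_compare)(const void *a, const void *b);
	int (*tuple_compare_hinted)(const void *a, const void *b);
};

static int
cmp_plain(const void *a, const void *b) { (void)a; (void)b; return 0; }

static int
cmp_hinted(const void *a, const void *b) { (void)a; (void)b; return 0; }

/* New style: the routine fills the key_def fields itself instead of
 * returning a single pointer, so it can install both variants. */
static void
key_def_set_tuple_compare(struct key_def *def)
{
	def->tuple_compare = cmp_plain;
	def->tuple_compare_hinted = cmp_hinted;
}
```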

Needed for #3961
---
 src/box/tuple_compare.cc | 62 ++++++++++++++++++++++++----------------
 1 file changed, 38 insertions(+), 24 deletions(-)

diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index cf4519ccb..1bcff2ca3 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -1049,21 +1049,26 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
 	tuple_compare_slowpath<true, true, true>
 };
 
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
+void
+key_def_set_tuple_compare(struct key_def *def)
 {
 	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
 			   2 * (def->has_optional_parts ? 1 : 0) +
 			   4 * (def->has_json_paths ? 1 : 0);
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
-			if (def->has_optional_parts)
-				return tuple_compare_sequential<true, true>;
-			else
-				return tuple_compare_sequential<true, false>;
+			if (def->has_optional_parts) {
+				def->tuple_compare =
+					tuple_compare_sequential<true, true>;
+			} else {
+				def->tuple_compare =
+					tuple_compare_sequential<true, false>;
+			}
 		} else {
-			return compare_slowpath_funcs[cmp_func_idx];
+			def->tuple_compare =
+				compare_slowpath_funcs[cmp_func_idx];
 		}
+		return;
 	}
 	assert(! def->has_optional_parts);
 	if (!key_def_has_collation(def) && !def->has_json_paths) {
@@ -1078,13 +1083,15 @@ tuple_compare_create(const struct key_def *def)
 				    cmp_arr[k].p[i * 2 + 1])
 					break;
 			if (i == def->part_count &&
-			    cmp_arr[k].p[i * 2] == UINT32_MAX)
-				return cmp_arr[k].f;
+			    cmp_arr[k].p[i * 2] == UINT32_MAX) {
+				def->tuple_compare = cmp_arr[k].f;
+				return;
+			}
 		}
 	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_sequential<false, false> :
-	       compare_slowpath_funcs[cmp_func_idx];
+	def->tuple_compare = key_def_is_sequential(def) ?
+			     tuple_compare_sequential<false, false> :
+			     compare_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare */
@@ -1277,8 +1284,8 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
 	tuple_compare_with_key_slowpath<true, true, true>
 };
 
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
+void
+key_def_set_tuple_compare_with_key(struct key_def *def)
 {
 	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
 			   2 * (def->has_optional_parts ? 1 : 0) +
@@ -1286,15 +1293,19 @@ tuple_compare_with_key_create(const struct key_def *def)
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts) {
-				return tuple_compare_with_key_sequential<true,
+				def->tuple_compare_with_key =
+					tuple_compare_with_key_sequential<true,
 									 true>;
 			} else {
-				return tuple_compare_with_key_sequential<true,
+				def->tuple_compare_with_key =
+					tuple_compare_with_key_sequential<true,
 									 false>;
 			}
 		} else {
-			return compare_with_key_slowpath_funcs[cmp_func_idx];
+			def->tuple_compare_with_key =
+				compare_with_key_slowpath_funcs[cmp_func_idx];
 		}
+		return;
 	}
 	assert(! def->has_optional_parts);
 	if (!key_def_has_collation(def) && !def->has_json_paths) {
@@ -1312,13 +1323,16 @@ tuple_compare_with_key_create(const struct key_def *def)
 					break;
 				}
 			}
-			if (i == def->part_count)
-				return cmp_wk_arr[k].f;
+			if (i == def->part_count) {
+				def->tuple_compare_with_key = cmp_wk_arr[k].f;
+				return;
+			}
 		}
 	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_with_key_sequential<false, false> :
-	       compare_with_key_slowpath_funcs[cmp_func_idx];
+	def->tuple_compare_with_key =
+		key_def_is_sequential(def) ?
+		tuple_compare_with_key_sequential<false, false> :
+		compare_with_key_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare_with_key */
@@ -1326,6 +1340,6 @@ tuple_compare_with_key_create(const struct key_def *def)
 void
 key_def_set_compare_func(struct key_def *def)
 {
-	def->tuple_compare = tuple_compare_create(def);
-	def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+	key_def_set_tuple_compare(def);
+	key_def_set_tuple_compare_with_key(def);
 }
-- 
2.21.0


* [PATCH v6 2/3] memtx: introduce tuple compare hint
  2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
  2019-03-14  8:19   ` Vladimir Davydov
  2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov
  2 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

Implement functions for retrieving tuple hints for a particular
key_def. A hint is an integer that can be used for tuple comparison
optimization: if the hint of one tuple is less than the hint of
another, then the first tuple is definitely less than the second;
tuple_compare must be called to get the comparison result only if
the hints are equal.

Hints are calculated using only the first part of key_def.

Hints are only useful when:
 * they are precalculated and stored along with the tuple;
the calculation is not cheap (though cheaper than tuple_compare), but
precalculated hints allow comparing tuples without even fetching the
tuple data;
 * the first part of the key_def is 'string' (for now);
 * since the hint is calculated using only the first part of the
key_def (and only the first several characters if it is a string),
this part must be different with high probability for every tuple pair.
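The string case can be sketched as packing the first bytes of the string into an integer whose numeric order matches the memcmp order of the prefixes (this mirrors the scheme in the diff, minus the 2-bit type tag; a sketch, not the exact patch code):

```c
#include <stdint.h>

/* Pack up to the first 7 bytes of a string, big-endian, into an
 * integer. Short strings are left-justified, which preserves
 * prefix order (e.g. "ab" < "b" and "a" < "ab"). */
static uint64_t
string_hint(const char *str, uint32_t len)
{
	uint64_t result = 0;
	uint32_t process_len = len < 7 ? len : 7;
	for (uint32_t i = 0; i < process_len; i++) {
		result <<= 8;
		result |= (unsigned char)str[i];
	}
	result <<= 8 * (7 - process_len);
	return result;
}
```

Strings that share their first 7 bytes get equal hints, which is exactly the case where the full comparison fallback is triggered.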

Closes #3961
---
 src/box/key_def.h        |  95 +++++++
 src/box/memtx_tree.c     |  46 ++-
 src/box/tuple_compare.cc | 584 +++++++++++++++++++++++++++++++++++++--
 src/lib/coll/coll.c      |  33 +++
 src/lib/coll/coll.h      |   4 +
 5 files changed, 730 insertions(+), 32 deletions(-)

diff --git a/src/box/key_def.h b/src/box/key_def.h
index dd62f6a35..0d8f3112e 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -137,6 +137,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
 typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
 			       const struct tuple *tuple_b,
 			       struct key_def *key_def);
+/** @copydoc tuple_compare_with_key_hinted() */
+typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
+					       uint64_t tuple_hint,
+					       const char *key,
+					       uint32_t part_count,
+					       uint64_t key_hint,
+					       struct key_def *key_def);
+/** @copydoc tuple_compare_hinted() */
+typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a,
+				      uint64_t tuple_a_hint,
+				      const struct tuple *tuple_b,
+				      uint64_t tuple_b_hint,
+				      struct key_def *key_def);
 /** @copydoc tuple_extract_key() */
 typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
 				     struct key_def *key_def,
@@ -152,6 +165,11 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
 /** @copydoc key_hash() */
 typedef uint32_t (*key_hash_t)(const char *key,
 				struct key_def *key_def);
+/** @copydoc tuple_hint() */
+typedef uint64_t (*tuple_hint_t)(const struct tuple *tuple,
+				  struct key_def *key_def);
+/** @copydoc key_hint() */
+typedef uint64_t (*key_hint_t)(const char *key, struct key_def *key_def);
 
 /* Definition of a multipart key. */
 struct key_def {
@@ -159,6 +177,10 @@ struct key_def {
 	tuple_compare_t tuple_compare;
 	/** @see tuple_compare_with_key() */
 	tuple_compare_with_key_t tuple_compare_with_key;
+	/** @see tuple_compare_with_key_hinted() */
+	tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
+	/** @see tuple_compare_hinted() */
+	tuple_compare_hinted_t tuple_compare_hinted;
 	/** @see tuple_extract_key() */
 	tuple_extract_key_t tuple_extract_key;
 	/** @see tuple_extract_key_raw() */
@@ -167,6 +189,10 @@ struct key_def {
 	tuple_hash_t tuple_hash;
 	/** @see key_hash() */
 	key_hash_t key_hash;
+	/** @see tuple_hint() */
+	tuple_hint_t tuple_hint;
+	/** @see key_hint() */
+	key_hint_t key_hint;
 	/**
 	 * Minimal part count which always is unique. For example,
 	 * if a secondary index is unique, then
@@ -571,6 +597,51 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
 	return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
 }
 
+/**
+ * Compare tuples using the key definition and comparison hints.
+ * @param tuple_a First tuple.
+ * @param tuple_a_hint Comparison hint is calculated for the
+ *                     @a tuple_a.
+ * @param tuple_b Second tuple.
+ * @param tuple_b_hint Comparison hint is calculated for the
+ *                     @a tuple_b.
+ * @param key_def Key definition.
+ * @retval 0  if key_fields(tuple_a) == key_fields(tuple_b)
+ * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b)
+ * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b)
+ */
+static inline int
+tuple_compare_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint,
+		     const struct tuple *tuple_b, uint64_t tuple_b_hint,
+		     struct key_def *key_def)
+{
+	return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
+					     tuple_b_hint, key_def);
+}
+
+/**
+ * Compare tuple with key using the key definition and
+ * comparison hints.
+ * @param tuple tuple
+ * @param tuple_hint Comparison hint is calculated for @a tuple.
+ * @param key key parts without MessagePack array header
+ * @param part_count the number of parts in @a key
+ * @param key_hint Comparison hint is calculated for @a key.
+ * @param key_def key definition
+ * @retval 0  if key_fields(tuple) == parts(key)
+ * @retval <0 if key_fields(tuple) < parts(key)
+ * @retval >0 if key_fields(tuple) > parts(key)
+ */
+static inline int
+tuple_compare_with_key_hinted(const struct tuple *tuple, uint64_t tuple_hint,
+			      const char *key, uint32_t part_count,
+			      uint64_t key_hint, struct key_def *key_def)
+{
+	return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
+						      part_count, key_hint,
+						      key_def);
+}
+
 /**
  * Compute hash of a tuple field.
  * @param ph1 - pointer to running hash
@@ -624,6 +695,30 @@ key_hash(const char *key, struct key_def *key_def)
 	return key_def->key_hash(key, key_def);
 }
 
+/**
+ * Get a comparison hint of a @a tuple.
+ * @param tuple - tuple to get the hint of.
+ * @param key_def - key_def that defines which comparison is used.
+ * @return the comparison auxiliary information.
+ */
+static inline uint64_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+	return key_def->tuple_hint(tuple, key_def);
+}
+
+/**
+ * Get a comparison hint of a @a key.
+ * @param key - key to get hint of.
+ * @param key_def - key_def that defines which comparison is used.
+ * @return the comparison auxiliary information.
+ */
+static inline uint64_t
+key_hint(const char *key, struct key_def *key_def)
+{
+	return key_def->key_hint(key, key_def);
+}
+
 #if defined(__cplusplus)
 } /* extern "C" */
 #endif /* defined(__cplusplus) */
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index d5b42efb6..2c1933ceb 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -47,6 +47,11 @@ struct memtx_tree_key_data {
 	const char *key;
 	/** Number of msgpacked search fields. */
 	uint32_t part_count;
+	/**
+	 * Comparison hint is calculated with
+	 * key_hint(key, ...).
+	 */
+	uint64_t hint;
 };
 
 /**
@@ -55,6 +60,11 @@ struct memtx_tree_key_data {
 struct memtx_tree_data {
 	/* Tuple that this node is represents. */
 	struct tuple *tuple;
+	/**
+	 * Comparison hint is calculated with
+	 * tuple_hint(tuple, ...).
+	 */
+	uint64_t hint;
 };
 
 /**
@@ -69,16 +79,18 @@ static bool
 memtx_tree_data_identical(const struct memtx_tree_data *a,
 			  const struct memtx_tree_data *b)
 {
-	return a->tuple == b->tuple;
+	return a->tuple == b->tuple && a->hint == b->hint;
 }
 
 #define BPS_TREE_NAME memtx_tree
 #define BPS_TREE_BLOCK_SIZE (512)
 #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
 #define BPS_TREE_COMPARE(a, b, arg)\
-	tuple_compare((&a)->tuple, (&b)->tuple, arg)
+	tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
+			     (&b)->hint, arg)
 #define BPS_TREE_COMPARE_KEY(a, b, arg)\
-	tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg)
+	tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\
+				      (b)->part_count, (b)->hint, arg)
 #define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b)
 #define bps_tree_elem_t struct memtx_tree_data
 #define bps_tree_key_t struct memtx_tree_key_data *
@@ -113,7 +125,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c)
 	const struct memtx_tree_data *data_a = a;
 	const struct memtx_tree_data *data_b = b;
 	struct key_def *key_def = c;
-	return tuple_compare(data_a->tuple, data_b->tuple, key_def);
+	return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple,
+				    data_b->hint, key_def);
 }
 
 /* {{{ MemtxTree Iterators ****************************************/
@@ -235,9 +248,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret)
 		memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
 	/* Use user key def to save a few loops. */
 	if (res == NULL ||
-	    tuple_compare_with_key(res->tuple, it->key_data.key,
-				   it->key_data.part_count,
-				   it->index_def->key_def) != 0) {
+	    tuple_compare_with_key_hinted(res->tuple, res->hint,
+					  it->key_data.key,
+					  it->key_data.part_count,
+					  it->key_data.hint,
+					  it->index_def->key_def) != 0) {
 		iterator->next = tree_iterator_dummie;
 		it->current.tuple = NULL;
 		*ret = NULL;
@@ -266,9 +281,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
 		memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
 	/* Use user key def to save a few loops. */
 	if (res == NULL ||
-	    tuple_compare_with_key(res->tuple, it->key_data.key,
-				   it->key_data.part_count,
-				   it->index_def->key_def) != 0) {
+	    tuple_compare_with_key_hinted(res->tuple, res->hint,
+					  it->key_data.key,
+					  it->key_data.part_count,
+					  it->key_data.hint,
+					  it->index_def->key_def) != 0) {
 		iterator->next = tree_iterator_dummie;
 		it->current.tuple = NULL;
 		*ret = NULL;
@@ -516,9 +533,11 @@ memtx_tree_index_get(struct index *base, const char *key,
 	assert(base->def->opts.is_unique &&
 	       part_count == base->def->key_def->part_count);
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
 	struct memtx_tree_key_data key_data;
 	key_data.key = key;
 	key_data.part_count = part_count;
+	key_data.hint = key_hint(key, cmp_def);
 	struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
 	*result = res != NULL ? res->tuple : NULL;
 	return 0;
@@ -530,9 +549,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 			 struct tuple **result)
 {
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *key_def = index->tree.arg;
 	if (new_tuple) {
 		struct memtx_tree_data new_data;
 		new_data.tuple = new_tuple;
+		new_data.hint = tuple_hint(new_tuple, key_def);
 		struct memtx_tree_data dup_data;
 		dup_data.tuple = NULL;
 
@@ -565,6 +586,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 	if (old_tuple) {
 		struct memtx_tree_data old_data;
 		old_data.tuple = old_tuple;
+		old_data.hint = tuple_hint(old_tuple, key_def);
 		memtx_tree_delete(&index->tree, old_data);
 	}
 	*result = old_tuple;
@@ -607,6 +629,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
 	it->type = type;
 	it->key_data.key = key;
 	it->key_data.part_count = part_count;
+	struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
+	it->key_data.hint = key_hint(key, cmp_def);
 	it->index_def = base->def;
 	it->tree = &index->tree;
 	it->tree_iterator = memtx_tree_invalid_iterator();
@@ -671,6 +695,8 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
 	struct memtx_tree_data *elem =
 		&index->build_array[index->build_array_size++];
 	elem->tuple = tuple;
+	struct key_def *key_def = index->tree.arg;
+	elem->hint = tuple_hint(tuple, key_def);
 	return 0;
 }
 
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 1bcff2ca3..86922c9bb 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -34,6 +34,383 @@
 #include "trivia/util.h" /* NOINLINE */
 #include <math.h>
 
+/**
+ * Tuple hints
+ *
+ * A hint is a value h(t) calculated for a tuple in terms of a
+ * particular key_def that obeys the following rules:
+ * if h(t1) < h(t2) then t1 < t2;
+ * if h(t1) > h(t2) then t1 > t2;
+ * if h(t1) == h(t2) then a regular comparison is required;
+ * These rules mean that instead of a direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their
+ * hints first, and only if the hints are equal compare
+ * the tuples themselves.
+ *
+ * The comparison hint has the following structure:
+ * uint64_t: [ CMP_HINT_TYPE | CMP_HINT_VALUE ]
+ *           64              62               0 bit
+ *
+ * The reserved value HINT_INVALID is returned when hint is
+ * undefined.
+ */
+#define HINT_MAX (((int64_t)1 << 61) - 1)
+#define HINT_MIN (-HINT_MAX - 1)
+#define HINT_UMAX (((uint64_t)1 << 62) - 1)
+
+#define HINT_INVALID UINT64_MAX
+
+#define HINT_TYPE_NUMBER ((uint8_t)1)
+#define HINT_TYPE_STRING ((uint8_t)2)
+#define HINT_TYPE_BOOL   ((uint8_t)3)
+
+#define HINT_TYPE(hint) ((uint64_t)hint >> 62)
+#define HINT_VALUE(hint) ((uint64_t)hint & (((uint64_t)1 << 62) - 1))
+#define HINT(type, value) ({ \
+	assert(HINT_TYPE(value) == 0); \
+	(((uint64_t)type << 62) | (uint64_t)value);})
+
+/**
+ * Perform hints comparison.
+ * Returns 1 when hint_a > hint_b, -1 when hint_b > hint_a;
+ * If the hints are incompatible or equal, a value of 0 is
+ * returned, and the further comparison of the original data is
+ * required.
+ */
+static int
+hint_cmp(uint64_t hint_a, uint64_t hint_b)
+{
+	if (hint_a != HINT_INVALID && hint_b != HINT_INVALID &&
+	    HINT_TYPE(hint_a) == HINT_TYPE(hint_b) &&
+	    HINT_VALUE(hint_a) != HINT_VALUE(hint_b))
+		return hint_a < hint_b ? -1 : 1;
+	else
+		return 0;
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_uint(const char *key, struct key_def *key_def)
+{
+	(void)key_def;
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
+	uint64_t ret;
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_NUMBER, 0);
+	assert(mp_typeof(*key) == MP_UINT);
+	uint64_t val = mp_decode_uint(&key);
+	ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
+	return HINT(HINT_TYPE_NUMBER, ret);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_uint(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_NUMBER, 0);
+	return key_hint_uint<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_int(const char *key, struct key_def *key_def)
+{
+	(void)key_def;
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_INTEGER);
+	uint64_t ret;
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_NUMBER, 0);
+	if (mp_typeof(*key) == MP_UINT) {
+		uint64_t val = mp_decode_uint(&key);
+		ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
+	} else {
+		assert(mp_typeof(*key) == MP_INT);
+		int64_t val = mp_decode_int(&key);
+		ret = val > HINT_MAX ? HINT_UMAX :
+		      val < HINT_MIN ? 0 :
+		      (uint64_t)val - (uint64_t)HINT_MIN;
+	}
+	return HINT(HINT_TYPE_NUMBER, ret);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_int(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_INTEGER);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_NUMBER, 0);
+	return key_hint_int<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_number(const char *key, struct key_def *key_def)
+{
+	(void)key_def;
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_NUMBER ||
+	       key_def->parts->type == FIELD_TYPE_SCALAR);
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_NUMBER, 0);
+	uint64_t ret;
+	switch (mp_typeof(*key)) {
+	case MP_FLOAT:
+	case MP_DOUBLE: {
+		double f = mp_typeof(*key) == MP_FLOAT ?
+			   mp_decode_float(&key) : mp_decode_double(&key);
+		if (!isfinite(f))
+			return HINT_INVALID;
+		double val;
+		(void)modf(f, &val);
+		ret = val > (double)HINT_MAX ? HINT_UMAX :
+		      val < (double)HINT_MIN ? 0 :
+		      (uint64_t)val - (uint64_t)HINT_MIN;
+		break;
+	}
+	case MP_INT: {
+		int64_t val = (uint64_t)mp_decode_int(&key);
+		ret = val > HINT_MAX ? HINT_UMAX :
+		      val < HINT_MIN ? 0 :
+		      (uint64_t)val - (uint64_t)HINT_MIN;
+		break;
+	}
+	case MP_UINT: {
+		uint64_t val = mp_decode_uint(&key);
+		ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
+		break;
+	}
+	default:
+		unreachable();
+	}
+	return HINT(HINT_TYPE_NUMBER, ret);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_number(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_NUMBER);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_NUMBER, 0);
+	return key_hint_number<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_boolean(const char *key, struct key_def *key_def)
+{
+	(void)key_def;
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_BOOLEAN ||
+	       key_def->parts->type == FIELD_TYPE_SCALAR);
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_BOOL, 0);
+	return HINT(HINT_TYPE_BOOL, (uint64_t)mp_decode_bool(&key));
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_boolean(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_BOOLEAN);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_BOOL, 0);
+	return key_hint_boolean<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_string(const char *key, struct key_def *key_def)
+{
+	(void)key_def;
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->coll == NULL);
+	assert(key_def->parts->type == FIELD_TYPE_STRING ||
+	       key_def->parts->type == FIELD_TYPE_SCALAR);
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_STRING, 0);
+	assert(mp_typeof(*key) == MP_STR);
+	uint32_t len;
+	const unsigned char *str =
+		(const unsigned char *)mp_decode_str(&key, &len);
+	uint64_t result = 0;
+	uint32_t process_len = MIN(len, 7);
+	for (uint32_t i = 0; i < process_len; i++) {
+		result <<= 8;
+		result |= str[i];
+	}
+	result <<= 8 * (7 - process_len);
+	return HINT(HINT_TYPE_STRING, result);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_string(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->coll == NULL);
+	assert(key_def->parts->type == FIELD_TYPE_STRING);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_STRING, 0);
+	return key_hint_string<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_string_coll(const char *key, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert((key_def->parts->type == FIELD_TYPE_STRING ||
+	        key_def->parts->type == FIELD_TYPE_SCALAR) &&
+	        key_def->parts->coll != NULL);
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_STRING, 0);
+	assert(mp_typeof(*key) == MP_STR);
+	uint32_t len;
+	const char *str = mp_decode_str(&key, &len);
+	return key_def->parts->coll->hint(str, len, key_def->parts->coll);
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_string_coll(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_STRING &&
+	        key_def->parts->coll != NULL);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_STRING, 0);
+	return key_hint_string_coll<false, is_nullable>(field, key_def);
+}
+
+template<bool is_optional, bool is_nullable>
+static uint64_t
+key_hint_scalar(const char *key, struct key_def *key_def)
+{
+	(void)key_def;
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_SCALAR);
+	if ((is_optional && key == NULL) ||
+	    (is_nullable && mp_typeof(*key) == MP_NIL))
+		return HINT(HINT_TYPE_BOOL, 0);
+	switch (mp_typeof(*key)) {
+	case MP_INT:
+	case MP_UINT:
+	case MP_FLOAT:
+	case MP_DOUBLE:
+		return key_hint_number<is_optional, is_nullable>(key, key_def);
+	case MP_BOOL:
+		return key_hint_boolean<is_optional, is_nullable>(key, key_def);
+	case MP_STR:
+		if (key_def->parts->coll == NULL)
+			return key_hint_string<is_optional,
+					is_nullable>(key, key_def);
+		else
+			return key_hint_string_coll<is_optional,
+					is_nullable>(key, key_def);
+	default:
+		return HINT_INVALID;
+	}
+}
+
+template<bool is_nullable>
+static uint64_t
+tuple_hint_scalar(const struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(key_def->parts->type == FIELD_TYPE_SCALAR);
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return HINT(HINT_TYPE_BOOL, 0);
+	return key_hint_scalar<false, is_nullable>(field, key_def);
+}
+
+void
+key_def_set_hint_func(struct key_def *def)
+{
+	def->key_hint = NULL;
+	def->tuple_hint = NULL;
+	bool is_nullable = key_part_is_nullable(def->parts);
+	switch (def->parts->type) {
+	case FIELD_TYPE_UNSIGNED: {
+		def->key_hint = is_nullable ? key_hint_uint<true, true> :
+					      key_hint_uint<true, false>;
+		def->tuple_hint = is_nullable ? tuple_hint_uint<true> :
+						tuple_hint_uint<false>;
+		break;
+	}
+	case FIELD_TYPE_INTEGER: {
+		def->key_hint = is_nullable ? key_hint_int<true, true> :
+					      key_hint_int<true, false>;
+		def->tuple_hint = is_nullable ? tuple_hint_int<true> :
+						tuple_hint_int<false>;
+		break;
+	}
+	case FIELD_TYPE_STRING: {
+		if (def->parts->coll != NULL) {
+			def->key_hint =
+				is_nullable ? key_hint_string_coll<true, true> :
+					      key_hint_string_coll<true, false>;
+			def->tuple_hint =
+				is_nullable ? tuple_hint_string_coll<true> :
+					      tuple_hint_string_coll<false>;
+		} else {
+			def->key_hint =
+				is_nullable ? key_hint_string<true, true> :
+					      key_hint_string<true, false>;
+			def->tuple_hint = is_nullable ? tuple_hint_string<true> :
+							tuple_hint_string<false>;
+		}
+		break;
+	}
+	case FIELD_TYPE_NUMBER: {
+		def->key_hint = is_nullable ? key_hint_number<true, true> :
+					      key_hint_number<true, false>;
+		def->tuple_hint = is_nullable ? tuple_hint_number<true> :
+						tuple_hint_number<false>;
+		break;
+	}
+	case FIELD_TYPE_BOOLEAN: {
+		def->key_hint = is_nullable ? key_hint_boolean<true, true> :
+					      key_hint_boolean<true, false>;
+		def->tuple_hint = is_nullable ? tuple_hint_boolean<true> :
+						tuple_hint_boolean<false>;
+		break;
+	}
+	case FIELD_TYPE_SCALAR: {
+		def->key_hint = is_nullable ? key_hint_scalar<true, true> :
+					      key_hint_scalar<true, false>;
+		def->tuple_hint = is_nullable ? tuple_hint_scalar<true> :
+					        tuple_hint_scalar<false>;
+		break;
+	}
+	default: {
+		break;
+	}
+	}
+}
+
 /* {{{ tuple_compare */
 
 /*
@@ -460,13 +837,17 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
 
 template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
-tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
-		       struct key_def *key_def)
+tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
+		uint64_t tuple_a_hint, const struct tuple *tuple_b,
+		uint64_t tuple_b_hint, struct key_def *key_def)
 {
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+	if (rc != 0)
+		return rc;
 	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
 	const char *tuple_b_raw = tuple_data(tuple_b);
@@ -502,7 +883,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	struct key_part *end;
 	const char *field_a, *field_b;
 	enum mp_type a_type, b_type;
-	int rc;
 	if (is_nullable)
 		end = part + key_def->unique_part_count;
 	else
@@ -592,8 +972,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 
 template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
-tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
-				uint32_t part_count, struct key_def *key_def)
+tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
+		       struct key_def *key_def)
+{
+	return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts,
+			has_json_paths>(tuple_a, HINT_INVALID, tuple_b,
+					HINT_INVALID, key_def);
+}
+
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
+		uint64_t tuple_hint, const char *key, uint32_t part_count,
+		uint64_t key_hint, struct key_def *key_def)
 {
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
@@ -601,6 +992,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key != NULL || part_count == 0);
 	assert(part_count <= key_def->part_count);
+	int rc = hint_cmp(tuple_hint, key_hint);
+	if (rc != 0)
+		return rc;
 	struct key_part *part = key_def->parts;
 	struct tuple_format *format = tuple_format(tuple);
 	const char *tuple_raw = tuple_data(tuple);
@@ -636,7 +1030,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	}
 
 	struct key_part *end = part + part_count;
-	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
 		const char *field;
 		if (has_json_paths) {
@@ -675,6 +1068,17 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	return 0;
 }
 
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
+				uint32_t part_count, struct key_def *key_def)
+{
+	return tuple_compare_with_key_slowpath_hinted<is_nullable,
+			has_optional_parts, has_json_paths>(tuple,
+					HINT_INVALID, key, part_count,
+					HINT_INVALID, key_def);
+}
+
 template<bool is_nullable>
 static inline int
 key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
@@ -733,13 +1137,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
 
 template<bool is_nullable, bool has_optional_parts>
 static inline int
-tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
-				  uint32_t part_count, struct key_def *key_def)
+tuple_compare_with_key_sequential_hinted(const struct tuple *tuple,
+		uint64_t tuple_hint, const char *key, uint32_t part_count,
+		uint64_t key_hint, struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(key_def_is_sequential(key_def));
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	int rc = hint_cmp(tuple_hint, key_hint);
+	if (rc != 0)
+		return rc;
 	const char *tuple_key = tuple_data(tuple);
 	uint32_t field_count = mp_decode_array(&tuple_key);
 	uint32_t cmp_part_count;
@@ -749,8 +1157,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 		assert(field_count >= part_count);
 		cmp_part_count = part_count;
 	}
-	int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
-						key_def);
+	rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
+					    key_def);
 	if (!has_optional_parts || rc != 0)
 		return rc;
 	/*
@@ -772,6 +1180,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 	return 0;
 }
 
+template<bool is_nullable, bool has_optional_parts>
+static inline int
+tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
+				  uint32_t part_count, struct key_def *key_def)
+{
+	return tuple_compare_with_key_sequential_hinted<is_nullable,
+			has_optional_parts>(tuple, HINT_INVALID, key,
+					    part_count, HINT_INVALID, key_def);
+}
+
 int
 key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 {
@@ -792,9 +1210,13 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 
 template <bool is_nullable, bool has_optional_parts>
 static int
-tuple_compare_sequential(const struct tuple *tuple_a,
-			 const struct tuple *tuple_b, key_def *key_def)
+tuple_compare_sequential_hinted(const struct tuple *tuple_a,
+		uint64_t tuple_a_hint, const struct tuple *tuple_b,
+		uint64_t tuple_b_hint, key_def *key_def)
 {
+	int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+	if (rc != 0)
+		return rc;
 	assert(!has_optional_parts || is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key_def_is_sequential(key_def));
@@ -812,7 +1234,6 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 	bool was_null_met = false;
 	struct key_part *part = key_def->parts;
 	struct key_part *end = part + key_def->unique_part_count;
-	int rc;
 	uint32_t i = 0;
 	for (; part < end; ++part, ++i) {
 		enum mp_type a_type, b_type;
@@ -859,6 +1280,16 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 	return 0;
 }
 
+template <bool is_nullable, bool has_optional_parts>
+static int
+tuple_compare_sequential(const struct tuple *tuple_a,
+			 const struct tuple *tuple_b, key_def *key_def)
+{
+	return tuple_compare_sequential_hinted<is_nullable,
+			has_optional_parts>(tuple_a, HINT_INVALID, tuple_b,
+					    HINT_INVALID, key_def);
+}
+
 template <int TYPE>
 static inline int
 field_compare(const char **field_a, const char **field_b);
@@ -989,6 +1420,18 @@ struct TupleCompare
 			compare(tuple_a, tuple_b, format_a,
 				format_b, field_a, field_b);
 	}
+
+	static int compare_hinted(const struct tuple *tuple_a,
+				  uint64_t tuple_a_hint,
+				  const struct tuple *tuple_b,
+				  uint64_t tuple_b_hint,
+				  struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple_a, tuple_b, key_def);
+	}
 };
 
 template <int TYPE, int ...MORE_TYPES>
@@ -1006,15 +1449,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> {
 		return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b,
 					format_a, format_b, field_a, field_b);
 	}
+
+	static int compare_hinted(const struct tuple *tuple_a,
+				  uint64_t tuple_a_hint,
+				  const struct tuple *tuple_b,
+				  uint64_t tuple_b_hint,
+				  struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple_a, tuple_b, key_def);
+	}
 };
 } /* end of anonymous namespace */
 
 struct comparator_signature {
 	tuple_compare_t f;
+	tuple_compare_hinted_t f_hinted;
 	uint32_t p[64];
 };
 #define COMPARATOR(...) \
-	{ TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } },
+	{ TupleCompare<__VA_ARGS__>::compare, \
+	  TupleCompare<__VA_ARGS__>::compare_hinted, \
+	  { __VA_ARGS__, UINT32_MAX } },
 
 /**
  * field1 no, field1 type, field2 no, field2 type, ...
@@ -1049,6 +1507,17 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
 	tuple_compare_slowpath<true, true, true>
 };
 
+static const tuple_compare_hinted_t compare_slowpath_hinted_funcs[] = {
+	tuple_compare_slowpath_hinted<false, false, false>,
+	tuple_compare_slowpath_hinted<true, false, false>,
+	tuple_compare_slowpath_hinted<false, true, false>,
+	tuple_compare_slowpath_hinted<true, true, false>,
+	tuple_compare_slowpath_hinted<false, false, true>,
+	tuple_compare_slowpath_hinted<true, false, true>,
+	tuple_compare_slowpath_hinted<false, true, true>,
+	tuple_compare_slowpath_hinted<true, true, true>
+};
+
 void
 key_def_set_tuple_compare(struct key_def *def)
 {
@@ -1060,13 +1529,21 @@ key_def_set_tuple_compare(struct key_def *def)
 			if (def->has_optional_parts) {
 				def->tuple_compare =
 					tuple_compare_sequential<true, true>;
+				def->tuple_compare_hinted =
+					tuple_compare_sequential_hinted<true,
+									true>;
 			} else {
 				def->tuple_compare =
 					tuple_compare_sequential<true, false>;
+				def->tuple_compare_hinted =
+					tuple_compare_sequential_hinted<true,
+									false>;
 			}
 		} else {
 			def->tuple_compare =
 				compare_slowpath_funcs[cmp_func_idx];
+			def->tuple_compare_hinted =
+				compare_slowpath_hinted_funcs[cmp_func_idx];
 		}
 		return;
 	}
@@ -1085,13 +1562,20 @@ key_def_set_tuple_compare(struct key_def *def)
 			if (i == def->part_count &&
 			    cmp_arr[k].p[i * 2] == UINT32_MAX) {
 				def->tuple_compare = cmp_arr[k].f;
+				def->tuple_compare_hinted = cmp_arr[k].f_hinted;
 				return;
 			}
 		}
 	}
-	def->tuple_compare = key_def_is_sequential(def) ?
-			     tuple_compare_sequential<false, false> :
-			     compare_slowpath_funcs[cmp_func_idx];
+	if (key_def_is_sequential(def)) {
+		def->tuple_compare = tuple_compare_sequential<false, false>;
+		def->tuple_compare_hinted =
+			tuple_compare_sequential_hinted<false, false>;
+	} else {
+		def->tuple_compare = compare_slowpath_funcs[cmp_func_idx];
+		def->tuple_compare_hinted =
+			compare_slowpath_hinted_funcs[cmp_func_idx];
+	}
 }
 
 /* }}} tuple_compare */
@@ -1222,6 +1706,17 @@ struct TupleCompareWithKey
 				compare(tuple, key, part_count,
 					key_def, format, field);
 	}
+
+	static int
+	compare_hinted(const struct tuple *tuple, uint64_t tuple_hint,
+		       const char *key, uint32_t part_count, uint64_t key_hint,
+		       struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_hint, key_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple, key, part_count, key_def);
+	}
 };
 
 template <int TYPE, int ...MORE_TYPES>
@@ -1242,6 +1737,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 			compare(tuple, key, part_count,
 				key_def, format, field);
 	}
+
+	static int
+	compare_hinted(const struct tuple *tuple, uint64_t tuple_hint,
+		       const char *key, uint32_t part_count, uint64_t key_hint,
+		       struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_hint, key_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple, key, part_count, key_def);
+	}
 };
 
 } /* end of anonymous namespace */
@@ -1249,11 +1755,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 struct comparator_with_key_signature
 {
 	tuple_compare_with_key_t f;
+	tuple_compare_with_key_hinted_t f_hinted;
 	uint32_t p[64];
 };
 
 #define KEY_COMPARATOR(...) \
-	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
+	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, \
+	  TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \
+	  { __VA_ARGS__ } },
 
 static const comparator_with_key_signature cmp_wk_arr[] = {
 	KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
@@ -1284,6 +1793,18 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
 	tuple_compare_with_key_slowpath<true, true, true>
 };
 
+static const tuple_compare_with_key_hinted_t
+			compare_with_key_slowpath_hinted_funcs[] = {
+	tuple_compare_with_key_slowpath_hinted<false, false, false>,
+	tuple_compare_with_key_slowpath_hinted<true, false, false>,
+	tuple_compare_with_key_slowpath_hinted<false, true, false>,
+	tuple_compare_with_key_slowpath_hinted<true, true, false>,
+	tuple_compare_with_key_slowpath_hinted<false, false, true>,
+	tuple_compare_with_key_slowpath_hinted<true, false, true>,
+	tuple_compare_with_key_slowpath_hinted<false, true, true>,
+	tuple_compare_with_key_slowpath_hinted<true, true, true>
+};
+
 void
 key_def_set_tuple_compare_with_key(struct key_def *def)
 {
@@ -1296,14 +1817,23 @@ key_def_set_tuple_compare_with_key(struct key_def *def)
 				def->tuple_compare_with_key =
 					tuple_compare_with_key_sequential<true,
 									 true>;
+				def->tuple_compare_with_key_hinted =
+					tuple_compare_with_key_sequential_hinted<
+								true, true>;
 			} else {
 				def->tuple_compare_with_key =
 					tuple_compare_with_key_sequential<true,
 									 false>;
+				def->tuple_compare_with_key_hinted =
+					tuple_compare_with_key_sequential_hinted<
+								true, false>;
 			}
 		} else {
 			def->tuple_compare_with_key =
 				compare_with_key_slowpath_funcs[cmp_func_idx];
+			def->tuple_compare_with_key_hinted =
+				compare_with_key_slowpath_hinted_funcs[
+								cmp_func_idx];
 		}
 		return;
 	}
@@ -1325,14 +1855,23 @@ key_def_set_tuple_compare_with_key(struct key_def *def)
 			}
 			if (i == def->part_count) {
 				def->tuple_compare_with_key = cmp_wk_arr[k].f;
+				def->tuple_compare_with_key_hinted =
+							cmp_wk_arr[k].f_hinted;
 				return;
 			}
 		}
 	}
-	def->tuple_compare_with_key =
-		key_def_is_sequential(def) ?
-		tuple_compare_with_key_sequential<false, false> :
-		compare_with_key_slowpath_funcs[cmp_func_idx];
+	if (key_def_is_sequential(def)) {
+		def->tuple_compare_with_key =
+			tuple_compare_with_key_sequential<false, false>;
+		def->tuple_compare_with_key_hinted =
+			tuple_compare_with_key_sequential_hinted<false, false>;
+	} else {
+		def->tuple_compare_with_key =
+			compare_with_key_slowpath_funcs[cmp_func_idx];
+		def->tuple_compare_with_key_hinted =
+			compare_with_key_slowpath_hinted_funcs[cmp_func_idx];
+	}
 }
 
 /* }}} tuple_compare_with_key */
@@ -1342,4 +1881,5 @@ key_def_set_compare_func(struct key_def *def)
 {
 	key_def_set_tuple_compare(def);
 	key_def_set_tuple_compare_with_key(def);
+	key_def_set_hint_func(def);
 }
diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
index afc15e809..3bd7c44f5 100644
--- a/src/lib/coll/coll.c
+++ b/src/lib/coll/coll.c
@@ -133,6 +133,37 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
 	return s_len;
 }
 
+/** Get a compare hint of a string using binary collation. */
+static uint64_t
+coll_bin_hint(const char *s, size_t s_len, struct coll *coll)
+{
+	(void)coll;
+	uint64_t result = 0;
+	uint32_t process_len = MIN(s_len, 7);
+	for (uint32_t i = 0; i < process_len; i++) {
+		result <<= 8;
+		result |= ((unsigned char *)s)[i];
+	}
+	result <<= 8 * (7 - process_len);
+	return result;
+}
+
+/** Get a compare hint of a string using ICU collation. */
+static uint64_t
+coll_icu_hint(const char *s, size_t s_len, struct coll *coll)
+{
+	assert(coll->type == COLL_TYPE_ICU);
+	UCharIterator itr;
+	uiter_setUTF8(&itr, s, s_len);
+	uint8_t buf[7];
+	uint32_t state[2] = {0, 0};
+	UErrorCode status = U_ZERO_ERROR;
+	int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, buf,
+					   sizeof(buf), &status);
+	assert(got >= 0 && got <= 7);
+	return coll_bin_hint((const char *)buf, got, NULL);
+}
+
 /**
  * Set up ICU collator and init cmp and hash members of collation.
  * @param coll Collation to set up.
@@ -242,6 +273,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
 	}
 	coll->cmp = coll_icu_cmp;
 	coll->hash = coll_icu_hash;
+	coll->hint = coll_icu_hint;
 	return 0;
 }
 
@@ -356,6 +388,7 @@ coll_new(const struct coll_def *def)
 		coll->collator = NULL;
 		coll->cmp = coll_bin_cmp;
 		coll->hash = coll_bin_hash;
+		coll->hint = coll_bin_hint;
 		break;
 	default:
 		unreachable();
diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h
index 8c9f94293..d695a02ad 100644
--- a/src/lib/coll/coll.h
+++ b/src/lib/coll/coll.h
@@ -48,6 +48,8 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
 typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
 				uint32_t *pcarry, struct coll *coll);
 
+typedef uint64_t (*coll_hint_f)(const char *s, size_t s_len, struct coll *coll);
+
 struct UCollator;
 
 /**
@@ -62,6 +64,8 @@ struct coll {
 	/** String comparator. */
 	coll_cmp_f cmp;
 	coll_hash_f hash;
+	/** Pointer to a routine that calculates a string comparison hint. */
+	coll_hint_f hint;
 	/** Reference counter. */
 	int refs;
 	/**
-- 
2.21.0


* [PATCH v6 3/3] box: introduce multikey indexes
  2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
  2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
@ 2019-03-13 12:15 ` Kirill Shcherbatov
  2 siblings, 0 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-13 12:15 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov

I haven't finished the review fixes for multikey indexes yet, but I've
included this part to show how the modified hint machinery affects
multikey indexes.

=========================================================================

With the introduction of the multikey index feature, a JSON path may
contain the [*] placeholder instead of an array index: this allows
indexing multiple documents by one JSON path automatically.

Example:
s = box.schema.space.create('withdata')
idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'},
                              {3, 'str', path = '[*].sname'}}})
_ = s:insert({1, 1, {{fname='James', sname='Bond'},
             {fname='Austin', sname='Powers'}}})
idx:get({'Austin', 'Powers'})
---
- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Austin',
'sname': 'Powers'}]]
...

Closes #1257
---
 src/box/key_def.c         |  15 ++++
 src/box/key_def.h         |   5 ++
 src/box/memtx_tree.c      | 175 +++++++++++++++++++++++++++++++++++++-
 src/box/tuple.c           |   8 +-
 src/box/tuple.h           | 122 +++++++++++++++++++++++---
 src/box/tuple_compare.cc  | 115 +++++++++++++++++++------
 src/box/tuple_format.c    | 120 +++++++++++++++++++++-----
 test/engine/json.result   |  80 ++++++++++++++++-
 test/engine/json.test.lua |  20 +++++
 9 files changed, 592 insertions(+), 68 deletions(-)

diff --git a/src/box/key_def.c b/src/box/key_def.c
index d4c979aa1..b9cc77699 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -164,9 +164,24 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 		*path_pool += path_len;
 		memcpy(def->parts[part_no].path, path, path_len);
 		def->parts[part_no].path_len = path_len;
+
+		int rc;
+		struct json_lexer lexer;
+		struct json_token token;
+		json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
+		while ((rc = json_lexer_next_token(&lexer, &token)) == 0) {
+			if (token.type == JSON_TOKEN_ANY) {
+				def->has_multikey_parts = true;
+				def->parts[part_no].is_multikey = true;
+				break;
+			} else if (token.type == JSON_TOKEN_END) {
+				break;
+			}
+		}
 	} else {
 		def->parts[part_no].path = NULL;
 		def->parts[part_no].path_len = 0;
+		def->parts[part_no].is_multikey = false;
 	}
 	column_mask_set_fieldno(&def->column_mask, fieldno);
 }
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 0d8f3112e..6a914df11 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -97,6 +97,8 @@ struct key_part {
 	char *path;
 	/** The length of JSON path. */
 	uint32_t path_len;
+	/** True if this is a multikey key part. */
+	bool is_multikey;
 	/**
 	 * Epoch of the tuple format the offset slot cached in
 	 * this part is valid for, see tuple_format::epoch.
@@ -205,6 +207,8 @@ struct key_def {
 	bool is_nullable;
 	/** True if some key part has JSON path. */
 	bool has_json_paths;
+	/** True if some key part has the array index placeholder [*]. */
+	bool has_multikey_parts;
 	/**
 	 * True, if some key parts can be absent in a tuple. These
 	 * fields assumed to be MP_NIL.
@@ -699,6 +703,7 @@ key_hash(const char *key, struct key_def *key_def)
  * Get a comparison hint for a @a tuple.
  * @param tuple - tuple to get uint64_t of.
  * @param key_def - key_def that defines which comparison is used.
+ * @param multikey_idx - index of multikey array item.
  * @return the comparison auxiliary information.
  */
 static inline uint64_t
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index 2c1933ceb..bfc618bdf 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -537,7 +537,8 @@ memtx_tree_index_get(struct index *base, const char *key,
 	struct memtx_tree_key_data key_data;
 	key_data.key = key;
 	key_data.part_count = part_count;
-	key_data.hint = key_hint(key, cmp_def);
+	if (!cmp_def->has_multikey_parts)
+		key_data.hint = key_hint(key, cmp_def);
 	struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
 	*result = res != NULL ? res->tuple : NULL;
 	return 0;
@@ -593,6 +594,98 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 	return 0;
 }
 
+static int
+multikey_get_array_sz(struct tuple *tuple, struct key_def *key_def)
+{
+	assert(key_def->has_multikey_parts);
+	struct key_part *part = key_def->parts;
+	/* FIXME: don't like it... */
+	while (part < key_def->parts + key_def->part_count &&
+	       !part->is_multikey)
+		part++;
+	assert(part->path != NULL && part->is_multikey);
+	struct tuple_field *field =
+		tuple_format_field_by_path(tuple_format(tuple), part->fieldno,
+					   part->path, part->path_len);
+	assert(field != NULL);
+	struct field_map_ext *field_map_ext =
+		tuple_field_map_ext((uint32_t *)tuple_field_map(tuple),
+				    field->offset_slot);
+	return field_map_ext->size;
+}
+
+int
+memtx_multikey_tree_index_replace(struct index *base, struct tuple *old_tuple,
+				  struct tuple *new_tuple,
+				  enum dup_replace_mode mode,
+				  struct tuple **result)
+{
+	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *key_def = index->tree.arg;
+	if (new_tuple != NULL) {
+		int errcode = 0, tree_res = 0;
+		struct tuple *dup_tuple = NULL;
+		int multikey_idx = 0;
+		int sz = multikey_get_array_sz(new_tuple, key_def);
+		for (; multikey_idx < sz; multikey_idx++) {
+			struct memtx_tree_data new_data;
+			new_data.tuple = new_tuple;
+			new_data.hint = multikey_idx;
+			struct memtx_tree_data dup_data;
+			dup_data.tuple = NULL;
+			tree_res = memtx_tree_insert(&index->tree, new_data,
+						     &dup_data);
+			if (tree_res != 0) {
+				diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
+					 "memtx_tree_index", "replace");
+				break;
+			}
+			dup_tuple = dup_data.tuple;
+			errcode = replace_check_dup(old_tuple, dup_tuple, mode);
+			if (errcode != 0) {
+				memtx_tree_delete(&index->tree, new_data);
+				if (dup_tuple != NULL) {
+					memtx_tree_insert(&index->tree,
+							  dup_data, NULL);
+				}
+				struct space *sp =
+					space_cache_find(base->def->space_id);
+				if (sp != NULL) {
+					diag_set(ClientError, errcode,
+						 base->def->name,
+						 space_name(sp));
+				}
+				break;
+			}
+		}
+		if (tree_res != 0 || errcode != 0) {
+			multikey_idx--;
+			for (; multikey_idx >= 0; multikey_idx--) {
+				struct memtx_tree_data data;
+				data.tuple = new_tuple;
+				data.hint = multikey_idx;
+				memtx_tree_delete(&index->tree, data);
+			}
+			return -1;
+		}
+		if (dup_tuple != NULL) {
+			*result = dup_tuple;
+			return 0;
+		}
+	}
+	if (old_tuple != NULL) {
+		int sz = multikey_get_array_sz(old_tuple, key_def);
+		for (int multikey_idx = 0; multikey_idx < sz; multikey_idx++) {
+			struct memtx_tree_data old_data;
+			old_data.tuple = old_tuple;
+			old_data.hint = multikey_idx;
+			memtx_tree_delete(&index->tree, old_data);
+		}
+	}
+	*result = old_tuple;
+	return 0;
+}
+
 static struct iterator *
 memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
 				 const char *key, uint32_t part_count)
@@ -630,7 +723,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
 	it->key_data.key = key;
 	it->key_data.part_count = part_count;
 	struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
-	it->key_data.hint = key_hint(key, cmp_def);
+	if (!cmp_def->has_multikey_parts)
+		it->key_data.hint = key_hint(key, cmp_def);
 	it->index_def = base->def;
 	it->tree = &index->tree;
 	it->tree_iterator = memtx_tree_invalid_iterator();
@@ -700,6 +794,48 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
 	return 0;
 }
 
+static int
+memtx_multikey_tree_index_build_next(struct index *base, struct tuple *tuple)
+{
+	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *key_def = index->tree.arg;
+	int sz = multikey_get_array_sz(tuple, key_def);
+
+	if (index->build_array == NULL) {
+		index->build_array =
+			(struct memtx_tree_data *)malloc(MEMTX_EXTENT_SIZE);
+		if (index->build_array == NULL) {
+			diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
+				 "memtx_tree_index", "build_next");
+			return -1;
+		}
+		index->build_array_alloc_size =
+			MEMTX_EXTENT_SIZE / sizeof(index->build_array[0]);
+	}
+	assert(index->build_array_size <= index->build_array_alloc_size);
+	if (index->build_array_size == index->build_array_alloc_size) {
+		index->build_array_alloc_size = index->build_array_alloc_size +
+					index->build_array_alloc_size / 2;
+		struct memtx_tree_data *tmp =
+			realloc(index->build_array,
+				index->build_array_alloc_size * sizeof(*tmp));
+		if (tmp == NULL) {
+			diag_set(OutOfMemory, index->build_array_alloc_size *
+				 sizeof(*tmp), "memtx_tree_index", "build_next");
+			return -1;
+		}
+		index->build_array = tmp;
+	}
+	for (int multikey_idx = 0; multikey_idx < sz; multikey_idx++) {
+		struct memtx_tree_data *elem =
+			&index->build_array[index->build_array_size++];
+		assert(index->build_array_size <= index->build_array_alloc_size);
+		elem->tuple = tuple;
+		elem->hint = multikey_idx;
+	}
+	return 0;
+}
+
 static void
 memtx_tree_index_end_build(struct index *base)
 {
@@ -802,6 +938,36 @@ static const struct index_vtab memtx_tree_index_vtab = {
 	/* .end_build = */ memtx_tree_index_end_build,
 };
 
+static const struct index_vtab memtx_multikey_tree_index_vtab = {
+	/* .destroy = */ memtx_tree_index_destroy,
+	/* .commit_create = */ generic_index_commit_create,
+	/* .abort_create = */ generic_index_abort_create,
+	/* .commit_modify = */ generic_index_commit_modify,
+	/* .commit_drop = */ generic_index_commit_drop,
+	/* .update_def = */ memtx_tree_index_update_def,
+	/* .depends_on_pk = */ memtx_tree_index_depends_on_pk,
+	/* .def_change_requires_rebuild = */
+		memtx_index_def_change_requires_rebuild,
+	/* .size = */ memtx_tree_index_size,
+	/* .bsize = */ memtx_tree_index_bsize,
+	/* .min = */ generic_index_min,
+	/* .max = */ generic_index_max,
+	/* .random = */ memtx_tree_index_random,
+	/* .count = */ memtx_tree_index_count,
+	/* .get = */ memtx_tree_index_get,
+	/* .replace = */ memtx_multikey_tree_index_replace,
+	/* .create_iterator = */ memtx_tree_index_create_iterator,
+	/* .create_snapshot_iterator = */
+		memtx_tree_index_create_snapshot_iterator,
+	/* .stat = */ generic_index_stat,
+	/* .compact = */ generic_index_compact,
+	/* .reset_stat = */ generic_index_reset_stat,
+	/* .begin_build = */ memtx_tree_index_begin_build,
+	/* .reserve = */ memtx_tree_index_reserve,
+	/* .build_next = */ memtx_multikey_tree_index_build_next,
+	/* .end_build = */ memtx_tree_index_end_build,
+};
+
 struct index *
 memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
 {
@@ -812,8 +978,11 @@ memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
 			 "malloc", "struct memtx_tree_index");
 		return NULL;
 	}
+	const struct index_vtab *vtab = def->key_def->has_multikey_parts ?
+					&memtx_multikey_tree_index_vtab :
+					&memtx_tree_index_vtab;
 	if (index_create(&index->base, (struct engine *)memtx,
-			 &memtx_tree_index_vtab, def) != 0) {
+			 vtab, def) != 0) {
 		free(index);
 		return NULL;
 	}
diff --git a/src/box/tuple.c b/src/box/tuple.c
index 7f06d4053..c8a938c4c 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -455,7 +455,8 @@ tuple_field_go_to_key(const char **field, const char *key, int len)
 }
 
 int
-tuple_go_to_path(const char **data, const char *path, uint32_t path_len)
+tuple_go_to_path_multikey(const char **data, const char *path,
+			  uint32_t path_len, uint64_t multikey_idx)
 {
 	int rc;
 	struct json_lexer lexer;
@@ -463,6 +464,8 @@ tuple_go_to_path(const char **data, const char *path, uint32_t path_len)
 	json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
 	while ((rc = json_lexer_next_token(&lexer, &token)) == 0) {
 		switch (token.type) {
+		case JSON_TOKEN_ANY:
+			token.num = (int)multikey_idx; /* Fall through. */
 		case JSON_TOKEN_NUM:
 			rc = tuple_field_go_to_index(data, token.num);
 			break;
@@ -532,7 +535,8 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
 	}
 	return tuple_field_raw_by_path(format, tuple, field_map, fieldno,
 				       path + lexer.offset,
-				       path_len - lexer.offset, NULL);
+				       path_len - lexer.offset, NULL,
+				       INVALID_MULTIKEY_INDEX);
 }
 
 /* }}} tuple_field_* getters */
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 8b12fd5a8..e498e1e41 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -45,6 +45,8 @@ struct slab_arena;
 struct quota;
 struct key_part;
 
+#define INVALID_MULTIKEY_INDEX UINT64_MAX
+
 /**
  * A format for standalone tuples allocated on runtime arena.
  * \sa tuple_new().
@@ -286,6 +288,7 @@ box_tuple_upsert(const box_tuple_t *tuple, const char *expr, const
 
 /** \endcond public */
 
+
 /**
  * An atom of Tarantool storage. Represents MsgPack Array.
  * Tuple has the following structure:
@@ -296,7 +299,9 @@ box_tuple_upsert(const box_tuple_t *tuple, const char *expr, const
  * |                                            ^
  * +---------------------------------------data_offset
  *
- * Each 'off_i' is the offset to the i-th indexed field.
+ * Each 'off_i' is the offset to the i-th indexed field, or the
+ * offset of a field map extension (struct field_map_ext), in
+ * sizeof(field_map[0]) units.
  */
 struct PACKED tuple
 {
@@ -328,6 +333,26 @@ struct PACKED tuple
 	 */
 };
 
+/**
+ * Tuple field map extent (struct field_map_ext). Allocated
+ * automatically by tuple_field_map_create() when the tuple
+ * field map must be grown dynamically.
+ */
+struct field_map_ext {
+	/* Sequence of data offset slots. */
+	uint32_t field_map[1];
+	/* The count of field_map items. */
+	uint32_t size;
+};
+
+static inline struct field_map_ext *
+tuple_field_map_ext(uint32_t *field_map, int32_t root_offset_slot)
+{
+	uint32_t slot_off = field_map[root_offset_slot];
+	return (struct field_map_ext *)((char *)field_map -
+		slot_off * sizeof(uint32_t) - sizeof(struct field_map_ext));
+}
+
 /** Size of the tuple including size of struct tuple. */
 static inline size_t
 tuple_size(const struct tuple *tuple)
@@ -508,11 +533,32 @@ tuple_field_count(const struct tuple *tuple)
  *                      with NULL.
  * @param path The path to process.
  * @param path_len The length of the @path.
+ * @param multikey_idx The multikey index hint: the item index
+ *                     to substitute for a JSON_TOKEN_ANY token.
  * @retval 0 On success.
  * @retval -1 In case of error in JSON path.
  */
 int
-tuple_go_to_path(const char **data, const char *path, uint32_t path_len);
+tuple_go_to_path_multikey(const char **data, const char *path,
+			  uint32_t path_len, uint64_t multikey_idx);
+
+/**
+ * Retrieve msgpack data by JSON path.
+ * @param data[in, out] Pointer to msgpack with data.
+ *                      If the field cannot be retrieved by the
+ *                      specified @path, it is overwritten with
+ *                      NULL.
+ * @param path The path to process.
+ * @param path_len The length of the @path.
+ * @retval 0 On success.
+ * @retval -1 In case of error in JSON path.
+ */
+static inline int
+tuple_go_to_path(const char **data, const char *path, uint32_t path_len)
+{
+	return tuple_go_to_path_multikey(data, path, path_len,
+					 INVALID_MULTIKEY_INDEX);
+}
 
 /**
  * Get tuple field by field index and relative JSON path.
@@ -533,7 +579,7 @@ static inline const char *
 tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 			const uint32_t *field_map, uint32_t fieldno,
 			const char *path, uint32_t path_len,
-			int32_t *offset_slot_hint)
+			int32_t *offset_slot_hint, uint64_t multikey_idx)
 {
 	int32_t offset_slot;
 	if (offset_slot_hint != NULL &&
@@ -558,10 +604,22 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 		if (offset_slot_hint != NULL)
 			*offset_slot_hint = offset_slot;
 offset_slot_access:
-		/* Indexed field */
-		if (field_map[offset_slot] == 0)
-			return NULL;
-		tuple += field_map[offset_slot];
+		if (multikey_idx != INVALID_MULTIKEY_INDEX) {
+			struct field_map_ext *field_map_ext =
+				tuple_field_map_ext((uint32_t *)field_map,
+						    offset_slot);
+			if (multikey_idx >= field_map_ext->size)
+				return NULL;
+			uint32_t off = field_map_ext->field_map[-multikey_idx];
+			if (off == 0)
+				return NULL;
+			tuple += off;
+		} else {
+			/* Indexed field */
+			if (field_map[offset_slot] == 0)
+				return NULL;
+			tuple += field_map[offset_slot];
+		}
 	} else {
 		uint32_t field_count;
 parse:
@@ -572,8 +630,8 @@ parse:
 		for (uint32_t k = 0; k < fieldno; k++)
 			mp_next(&tuple);
 		if (path != NULL &&
-		    unlikely(tuple_go_to_path(&tuple, path,
-						    path_len) != 0))
+		    unlikely(tuple_go_to_path_multikey(&tuple, path, path_len,
+						       multikey_idx) != 0))
 			return NULL;
 	}
 	return tuple;
@@ -595,7 +653,7 @@ tuple_field_raw(struct tuple_format *format, const char *tuple,
 		const uint32_t *field_map, uint32_t field_no)
 {
 	return tuple_field_raw_by_path(format, tuple, field_map, field_no,
-				       NULL, 0, NULL);
+				       NULL, 0, NULL, INVALID_MULTIKEY_INDEX);
 }
 
 /**
@@ -634,16 +692,18 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
 			     uint32_t path_len, uint32_t path_hash);
 
 /**
- * Get a tuple field pointed to by an index part.
+ * Get a tuple field pointed to by an index part and multikey hint.
  * @param format Tuple format.
  * @param tuple A pointer to MessagePack array.
  * @param field_map A pointer to the LAST element of field map.
+ * @param multikey_idx A multikey hint.
  * @param part Index part to use.
  * @retval Field data if the field exists or NULL.
  */
 static inline const char *
-tuple_field_raw_by_part(struct tuple_format *format, const char *data,
-			const uint32_t *field_map, struct key_part *part)
+tuple_field_raw_by_part_multikey(struct tuple_format *format, const char *data,
+				 const uint32_t *field_map,
+				 struct key_part *part, uint64_t multikey_idx)
 {
 	if (unlikely(part->format_epoch != format->epoch)) {
 		assert(format->epoch != 0);
@@ -656,7 +716,41 @@ tuple_field_raw_by_part(struct tuple_format *format, const char *data,
 	}
 	return tuple_field_raw_by_path(format, data, field_map, part->fieldno,
 				       part->path, part->path_len,
-				       &part->offset_slot_cache);
+				       &part->offset_slot_cache, multikey_idx);
+}
+
+/**
+ * Get a tuple field referred to by index @part and a multikey hint.
+ * @param tuple Tuple to get the field from.
+ * @param part Index part to use.
+ * @param multikey_idx A multikey hint.
+ * @retval Field data if the field exists or NULL.
+ */
+static inline const char *
+tuple_field_by_part_multikey(const struct tuple *tuple, struct key_part *part,
+			     uint64_t multikey_idx)
+{
+	return tuple_field_raw_by_part_multikey(tuple_format(tuple),
+						tuple_data(tuple),
+						tuple_field_map(tuple), part,
+						multikey_idx);
+}
+
+
+/**
+ * Get a tuple field pointed to by an index part.
+ * @param format Tuple format.
+ * @param tuple A pointer to MessagePack array.
+ * @param field_map A pointer to the LAST element of field map.
+ * @param part Index part to use.
+ * @retval Field data if the field exists or NULL.
+ */
+static inline const char *
+tuple_field_raw_by_part(struct tuple_format *format, const char *data,
+			const uint32_t *field_map, struct key_part *part)
+{
+	return tuple_field_raw_by_part_multikey(format, data, field_map,
+						part, INVALID_MULTIKEY_INDEX);
 }
 
 /**
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 86922c9bb..cdb08f8fe 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -351,6 +351,8 @@ key_def_set_hint_func(struct key_def *def)
 {
 	def->key_hint = NULL;
 	def->tuple_hint = NULL;
+	if (def->has_multikey_parts)
+		return;
 	bool is_nullable = key_part_is_nullable(def->parts);
 	switch (def->parts->type) {
 	case FIELD_TYPE_UNSIGNED: {
@@ -835,7 +837,8 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	return i;
 }
 
-template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths,
+	 bool is_multikey>
 static inline int
 tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
 		uint64_t tuple_a_hint, const struct tuple *tuple_b,
@@ -845,8 +848,10 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	assert(is_multikey == key_def->has_multikey_parts);
+	assert(!is_multikey || is_multikey == has_json_paths);
 	int rc = 0;
-	if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
+	if (!is_multikey && (rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
 		return rc;
 	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
@@ -889,7 +894,14 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
 		end = part + key_def->part_count;
 
 	for (; part < end; part++) {
-		if (has_json_paths) {
+		if (is_multikey) {
+			field_a = tuple_field_raw_by_part_multikey(format_a,
+					tuple_a_raw, field_map_a, part,
+					tuple_a_hint);
+			field_b = tuple_field_raw_by_part_multikey(format_b,
+					tuple_b_raw, field_map_b, part,
+					tuple_b_hint);
+		} else if (has_json_paths) {
 			field_a = tuple_field_raw_by_part(format_a, tuple_a_raw,
 							  field_map_a, part);
 			field_b = tuple_field_raw_by_part(format_b, tuple_b_raw,
@@ -946,7 +958,14 @@ tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
 	 */
 	end = key_def->parts + key_def->part_count;
 	for (; part < end; ++part) {
-		if (has_json_paths) {
+		if (is_multikey) {
+			field_a = tuple_field_raw_by_part_multikey(format_a,
+					tuple_a_raw, field_map_a, part,
+					tuple_a_hint);
+			field_b = tuple_field_raw_by_part_multikey(format_b,
+					tuple_b_raw, field_map_b, part,
+					tuple_b_hint);
+		} else if (has_json_paths) {
 			field_a = tuple_field_raw_by_part(format_a, tuple_a_raw,
 							  field_map_a, part);
 			field_b = tuple_field_raw_by_part(format_b, tuple_b_raw,
@@ -976,24 +995,28 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		       struct key_def *key_def)
 {
 	return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts,
-			has_json_paths>(tuple_a, HINT_INVALID, tuple_b,
+			has_json_paths, false>(tuple_a, HINT_INVALID, tuple_b,
 					HINT_INVALID, key_def);
 }
 
-template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths,
+	 bool is_multikey>
 static inline int
 tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
 		uint64_t tuple_hint, const char *key, uint32_t part_count,
 		uint64_t key_hint, struct key_def *key_def)
 {
+	(void)key_hint;
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key != NULL || part_count == 0);
 	assert(part_count <= key_def->part_count);
+	assert(is_multikey == key_def->has_multikey_parts);
+	assert(!is_multikey || is_multikey == has_json_paths);
 	int rc = 0;
-	if ((rc = hint_cmp(tuple_hint, key_hint)) != 0)
+	if (!is_multikey && (rc = hint_cmp(tuple_hint, key_hint)) != 0)
 		return rc;
 	struct key_part *part = key_def->parts;
 	struct tuple_format *format = tuple_format(tuple);
@@ -1002,7 +1025,10 @@ tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
 	enum mp_type a_type, b_type;
 	if (likely(part_count == 1)) {
 		const char *field;
-		if (has_json_paths) {
+		if (is_multikey) {
+			field = tuple_field_raw_by_part_multikey(format,
+					tuple_raw, field_map, part, tuple_hint);
+		} else if (has_json_paths) {
 			field = tuple_field_raw_by_part(format, tuple_raw,
 							field_map, part);
 		} else {
@@ -1032,7 +1058,10 @@ tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
 	struct key_part *end = part + part_count;
 	for (; part < end; ++part, mp_next(&key)) {
 		const char *field;
-		if (has_json_paths) {
+		if (is_multikey) {
+			field = tuple_field_raw_by_part_multikey(format,
+					tuple_raw, field_map, part, tuple_hint);
+		} else if (has_json_paths) {
 			field = tuple_field_raw_by_part(format, tuple_raw,
 							field_map, part);
 		} else {
@@ -1074,7 +1103,7 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 				uint32_t part_count, struct key_def *key_def)
 {
 	return tuple_compare_with_key_slowpath_hinted<is_nullable,
-			has_optional_parts, has_json_paths>(tuple,
+			has_optional_parts, has_json_paths, false>(tuple,
 					HINT_INVALID, key, part_count,
 					HINT_INVALID, key_def);
 }
@@ -1508,14 +1537,22 @@ static const tuple_compare_t compare_slowpath_funcs[] = {
 };
 
 static const tuple_compare_hinted_t compare_slowpath_hinted_funcs[] = {
-	tuple_compare_slowpath_hinted<false, false, false>,
-	tuple_compare_slowpath_hinted<true, false, false>,
-	tuple_compare_slowpath_hinted<false, true, false>,
-	tuple_compare_slowpath_hinted<true, true, false>,
-	tuple_compare_slowpath_hinted<false, false, true>,
-	tuple_compare_slowpath_hinted<true, false, true>,
-	tuple_compare_slowpath_hinted<false, true, true>,
-	tuple_compare_slowpath_hinted<true, true, true>
+	tuple_compare_slowpath_hinted<false, false, false, false>,
+	tuple_compare_slowpath_hinted<true, false, false, false>,
+	tuple_compare_slowpath_hinted<false, true, false, false>,
+	tuple_compare_slowpath_hinted<true, true, false, false>,
+	tuple_compare_slowpath_hinted<false, false, true, false>,
+	tuple_compare_slowpath_hinted<true, false, true, false>,
+	tuple_compare_slowpath_hinted<false, true, true, false>,
+	tuple_compare_slowpath_hinted<true, true, true, false>,
+	tuple_compare_slowpath_hinted<false, false, false, true>,
+	tuple_compare_slowpath_hinted<true, false, false, true>,
+	tuple_compare_slowpath_hinted<false, true, false, true>,
+	tuple_compare_slowpath_hinted<true, true, false, true>,
+	tuple_compare_slowpath_hinted<false, false, true, true>,
+	tuple_compare_slowpath_hinted<true, false, true, true>,
+	tuple_compare_slowpath_hinted<false, true, true, true>,
+	tuple_compare_slowpath_hinted<true, true, true, true>,
 };
 
 void
@@ -1523,7 +1560,14 @@ key_def_set_tuple_compare(struct key_def *def)
 {
 	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
 			   2 * (def->has_optional_parts ? 1 : 0) +
-			   4 * (def->has_json_paths ? 1 : 0);
+			   4 * (def->has_json_paths ? 1 : 0) +
+			   8 * (def->has_multikey_parts ? 1 : 0);
+	if (def->has_multikey_parts) {
+		def->tuple_compare = NULL;
+		def->tuple_compare_hinted =
+			compare_slowpath_hinted_funcs[cmp_func_idx];
+		return;
+	}
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts) {
@@ -1795,14 +1839,22 @@ static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
 
 static const tuple_compare_with_key_hinted_t
 			compare_with_key_slowpath_hinted_funcs[] = {
-	tuple_compare_with_key_slowpath_hinted<false, false, false>,
-	tuple_compare_with_key_slowpath_hinted<true, false, false>,
-	tuple_compare_with_key_slowpath_hinted<false, true, false>,
-	tuple_compare_with_key_slowpath_hinted<true, true, false>,
-	tuple_compare_with_key_slowpath_hinted<false, false, true>,
-	tuple_compare_with_key_slowpath_hinted<true, false, true>,
-	tuple_compare_with_key_slowpath_hinted<false, true, true>,
-	tuple_compare_with_key_slowpath_hinted<true, true, true>
+	tuple_compare_with_key_slowpath_hinted<false, false, false, false>,
+	tuple_compare_with_key_slowpath_hinted<true, false, false, false>,
+	tuple_compare_with_key_slowpath_hinted<false, true, false, false>,
+	tuple_compare_with_key_slowpath_hinted<true, true, false, false>,
+	tuple_compare_with_key_slowpath_hinted<false, false, true, false>,
+	tuple_compare_with_key_slowpath_hinted<true, false, true, false>,
+	tuple_compare_with_key_slowpath_hinted<false, true, true, false>,
+	tuple_compare_with_key_slowpath_hinted<true, true, true, false>,
+	tuple_compare_with_key_slowpath_hinted<false, false, false, true>,
+	tuple_compare_with_key_slowpath_hinted<true, false, false, true>,
+	tuple_compare_with_key_slowpath_hinted<false, true, false, true>,
+	tuple_compare_with_key_slowpath_hinted<true, true, false, true>,
+	tuple_compare_with_key_slowpath_hinted<false, false, true, true>,
+	tuple_compare_with_key_slowpath_hinted<true, false, true, true>,
+	tuple_compare_with_key_slowpath_hinted<false, true, true, true>,
+	tuple_compare_with_key_slowpath_hinted<true, true, true, true>
 };
 
 void
@@ -1810,7 +1862,14 @@ key_def_set_tuple_compare_with_key(struct key_def *def)
 {
 	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
 			   2 * (def->has_optional_parts ? 1 : 0) +
-			   4 * (def->has_json_paths ? 1 : 0);
+			   4 * (def->has_json_paths ? 1 : 0) +
+			   8 * (def->has_multikey_parts ? 1 : 0);
+	if (def->has_multikey_parts) {
+		def->tuple_compare_with_key = NULL;
+		def->tuple_compare_with_key_hinted =
+			compare_with_key_slowpath_hinted_funcs[cmp_func_idx];
+		return;
+	}
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts) {
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 4439ce3e0..99b602cd5 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -33,6 +33,7 @@
 #include "json/json.h"
 #include "tuple_format.h"
 #include "coll_id_cache.h"
+#include "tuple.h"
 
 #include "third_party/PMurHash.h"
 
@@ -233,14 +234,19 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno,
 	json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
 	while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 &&
 	       field->token.type != JSON_TOKEN_END) {
-		if (field->token.type == JSON_TOKEN_ANY) {
-			diag_set(ClientError, ER_UNSUPPORTED,
-				"Tarantool", "multikey indexes");
-			goto fail;
-		}
 		enum field_type expected_type =
 			field->token.type == JSON_TOKEN_STR ?
 			FIELD_TYPE_MAP : FIELD_TYPE_ARRAY;
+		if (field->token.type == JSON_TOKEN_ANY &&
+		    !json_token_is_multikey(&parent->token) &&
+		    !json_token_is_leaf(&parent->token)) {
+			assert(expected_type == FIELD_TYPE_ARRAY);
+			diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
+				 tuple_field_path(parent),
+				 field_type_strs[parent->type],
+				 "multikey array");
+			goto fail;
+		}
 		if (field_type1_contains_type2(parent->type, expected_type)) {
 			parent->type = expected_type;
 		} else {
@@ -475,9 +481,8 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 					break;
 				format->min_tuple_size += mp_sizeof_nil();
 			}
-		} else {
+		} else if (field->token.type == JSON_TOKEN_STR) {
 			/* Account a key string for map member. */
-			assert(field->token.type == JSON_TOKEN_STR);
 			format->min_tuple_size +=
 				mp_sizeof_str(field->token.len);
 		}
@@ -805,6 +810,59 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
 	return true;
 }
 
+/**
+ * Grow the tuple_field_map allocation to the left by
+ * extent_size zero-initialized bytes.
+ */
+static int
+tuple_field_map_realloc(uint32_t **field_map, uint32_t *field_map_size,
+			uint32_t extent_size, struct region *region)
+{
+	assert(extent_size % sizeof(uint32_t) == 0);
+	uint32_t new_field_map_size = *field_map_size + extent_size;
+	uint32_t *new_field_map = region_alloc(region, new_field_map_size);
+	if (new_field_map == NULL) {
+		diag_set(OutOfMemory, new_field_map_size, "region_alloc",
+			 "new_field_map");
+		return -1;
+	}
+	memset(new_field_map, 0, extent_size);
+	if (*field_map != NULL) {
+		memcpy((char *)new_field_map + extent_size,
+		       (char *)*field_map - *field_map_size, *field_map_size);
+	}
+	*field_map = (uint32_t *)((char *)new_field_map + new_field_map_size);
+	*field_map_size = new_field_map_size;
+	return 0;
+}
+
+/**
+ * Retrieve tuple field map extent by root_offset_slot, reallocate
+ * memory if required.
+ */
+struct field_map_ext *
+tuple_field_map_ext_retrieve(uint32_t **field_map, uint32_t *field_map_size,
+			     int32_t root_offset_slot, uint32_t extent_items,
+			     struct region *region)
+{
+	uint32_t extent_sz = sizeof(struct field_map_ext) +
+			     extent_items * sizeof(uint32_t);
+	uint32_t slot_off = (*field_map)[root_offset_slot];
+	if (slot_off == 0) {
+		/* Field map extent was not created yet. */
+		slot_off = *field_map_size / sizeof(uint32_t);
+		(*field_map)[root_offset_slot] = slot_off;
+		if (tuple_field_map_realloc(field_map, field_map_size,
+					    extent_sz, region) != 0)
+			return NULL;
+	}
+	struct field_map_ext *field_map_ext =
+		tuple_field_map_ext(*field_map, root_offset_slot);
+	assert(field_map_ext->size == 0 || field_map_ext->size == extent_items);
+	field_map_ext->size = extent_items;
+	return field_map_ext;
+}
+
 /** @sa declaration for details. */
 int
 tuple_field_map_create(struct tuple_format *format, const char *tuple,
@@ -817,14 +875,11 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 		return 0; /* Nothing to initialize */
 	}
 	struct region *region = &fiber()->gc;
-	*field_map_size = format->field_map_size;
-	*field_map = region_alloc(region, *field_map_size);
-	if (*field_map == NULL) {
-		diag_set(OutOfMemory, *field_map_size, "region_alloc",
-			 "field_map");
+	*field_map = NULL;
+	*field_map_size = 0;
+	if (tuple_field_map_realloc(field_map, field_map_size,
+				    format->field_map_size, region) != 0)
 		return -1;
-	}
-	*field_map = (uint32_t *)((char *)*field_map + *field_map_size);
 
 	const char *pos = tuple;
 	int rc = 0;
@@ -864,11 +919,6 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 	uint32_t defined_field_count = MIN(field_count, validate ?
 					   tuple_format_field_count(format) :
 					   format->index_field_count);
-	/*
-	 * Nullify field map to be able to detect by 0,
-	 * which key fields are absent in tuple_field().
-	 */
-	memset((char *)*field_map - *field_map_size, 0, *field_map_size);
 	/*
 	 * Prepare mp stack of the size equal to the maximum depth
 	 * of the indexed field in the format::fields tree
@@ -885,6 +935,12 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 	struct mp_stack stack;
 	mp_stack_create(&stack, format->fields_depth, frames);
 	mp_stack_push(&stack, MP_ARRAY, defined_field_count);
+	/**
+	 * Pointer to the stack entry that represents the multikey
+	 * index root ARRAY entry.
+	 */
+	struct mp_frame *mk_parent_frame = NULL;
+
 	struct tuple_field *field;
 	struct json_token *parent = &format->fields.root;
 	while (true) {
@@ -901,6 +957,10 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 			mp_stack_pop(&stack);
 			if (mp_stack_is_empty(&stack))
 				goto finish;
+			if (json_token_is_multikey(parent)) {
+				/* Leave multikey index branch. */
+				mk_parent_frame = NULL;
+			}
 			parent = parent->parent;
 		}
 		/*
@@ -950,8 +1010,23 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 					 field_type_strs[field->type]);
 				goto error;
 			}
-			if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+			if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
+			    mk_parent_frame == NULL) {
 				(*field_map)[field->offset_slot] = pos - tuple;
+			} else if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
+				   mk_parent_frame != NULL) {
+				uint32_t extent_items = mk_parent_frame->count;
+				struct field_map_ext *field_map_ext =
+					tuple_field_map_ext_retrieve(field_map,
+							field_map_size,
+							field->offset_slot,
+							extent_items, region);
+				if (field_map_ext == NULL)
+					goto error;
+				int multikey_idx = mk_parent_frame->curr;
+				field_map_ext->field_map[
+					-multikey_idx] = pos - tuple;
+			}
 			if (required_fields != NULL)
 				bit_clear(required_fields, field->id);
 		}
@@ -968,6 +1043,11 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 					mp_decode_array(&pos) :
 					mp_decode_map(&pos);
 			mp_stack_push(&stack, type, size);
+			if (json_token_is_multikey(&field->token)) {
+				/* Track multikey entry frame. */
+				assert(type == MP_ARRAY);
+				mk_parent_frame = &frames[stack.used - 1];
+			}
 			parent = &field->token;
 		} else {
 			mp_next(&pos);
diff --git a/test/engine/json.result b/test/engine/json.result
index a790cdfbc..5c7a9b3b3 100644
--- a/test/engine/json.result
+++ b/test/engine/json.result
@@ -691,7 +691,85 @@ s = box.schema.space.create('withdata')
 ...
 idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}})
 ---
-- error: Tarantool does not support multikey indexes
+...
+s:insert({1, 1, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}})
+---
+- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({1, 1, {{fname='Ivan', sname='Ivanych'}}})
+---
+- [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+...
+idx:select({'James', 'Bond'})
+---
+- - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+idx:select({'Kirill', 'Shcherbatov'})
+---
+- []
+...
+idx:select({'Ivan', 'Ivanych'})
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+...
+idx:select({'Vasya', 'Pupkin'})
+---
+- - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+idx:select()
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+  - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+  - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+---
+- error: Duplicate key exists in unique index 'idx' in space 'withdata'
+...
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+---
+- error: Duplicate key exists in unique index 'idx' in space 'withdata'
+...
+idx:select()
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+  - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+  - [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+idx:delete({'Vasya', 'Pupkin'})
+---
+- [1, 1, [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+---
+- [2, 1, [{'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+---
+- [2, 1, [{'fname': 'James', 'sname': 'Bond'}]]
+...
+idx:select()
+---
+- - [1, 1, [{'fname': 'Ivan', 'sname': 'Ivanych'}]]
+  - [2, 1, [{'fname': 'James', 'sname': 'Bond'}]]
+  - [2, 1, [{'fname': 'Vasya', 'sname': 'Pupkin'}]]
+...
+s:drop()
+---
+...
+s = box.schema.space.create('withdata')
+---
+...
+pk = s:create_index('pk')
+---
+...
+idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}})
+---
+...
+s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}})
+---
+- [1, 1, [{'fname': 'A1', 'extra': {'sname': 'A2'}}, {'fname': 'B1'}, {'fname': 'C1',
+      'extra': {'sname': 'C2'}}]]
 ...
 s:drop()
 ---
diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua
index f9a7180b1..dc6016af9 100644
--- a/test/engine/json.test.lua
+++ b/test/engine/json.test.lua
@@ -198,4 +198,24 @@ s:drop()
 --
 s = box.schema.space.create('withdata')
 idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}})
+s:insert({1, 1, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}})
+s:insert({1, 1, {{fname='Ivan', sname='Ivanych'}}})
+idx:select({'James', 'Bond'})
+idx:select({'Kirill', 'Shcherbatov'})
+idx:select({'Ivan', 'Ivanych'})
+idx:select({'Vasya', 'Pupkin'})
+idx:select()
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+idx:select()
+idx:delete({'Vasya', 'Pupkin'})
+s:insert({2, 1, {{fname='Vasya', sname='Pupkin'}}})
+s:insert({2, 1, {{fname='James', sname='Bond'}}})
+idx:select()
+s:drop()
+
+s = box.schema.space.create('withdata')
+pk = s:create_index('pk')
+idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}})
+s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}})
 s:drop()
-- 
2.21.0
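
The field-map hunks above rely on a layout where the map pointer references
the end of the allocation and slots are addressed with negative indices, so
the map can be regrown to the left (e.g. for multikey extents) without
renumbering existing slots. A simplified, illustrative model of that layout
(hypothetical names, not the patch's code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model: field_map points one element past the end of
 * the storage, and indexed fields are addressed through negative
 * "offset slots" (-1 .. -FIELD_MAP_SLOTS). Growing the allocation
 * to the left never invalidates existing slot numbers.
 */
enum { FIELD_MAP_SLOTS = 4 };

static uint32_t field_map_storage[FIELD_MAP_SLOTS];
/* Points past the end; valid slots are negative. */
static uint32_t *const field_map = field_map_storage + FIELD_MAP_SLOTS;

static void
field_map_set(int32_t offset_slot, uint32_t data_offset)
{
	field_map[offset_slot] = data_offset;
}

static uint32_t
field_map_get(int32_t offset_slot)
{
	return field_map[offset_slot];
}
```

In the patch, a multikey extent is reached the same way: the root slot holds
an offset further to the left of the map instead of a field offset.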

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
@ 2019-03-14  7:04   ` Vladimir Davydov
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-14  7:04 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

On Wed, Mar 13, 2019 at 03:15:37PM +0300, Kirill Shcherbatov wrote:
> @@ -1326,6 +1340,6 @@ tuple_compare_with_key_create(const struct key_def *def)
>  void
>  key_def_set_compare_func(struct key_def *def)
>  {
> -	def->tuple_compare = tuple_compare_create(def);
> -	def->tuple_compare_with_key = tuple_compare_with_key_create(def);
> +	key_def_set_tuple_compare(def);
> +	key_def_set_tuple_compare_with_key(def);
>  }

The two functions share a lot of code - all branching is basically the
same. I think we'd better merge them instead.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
  2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
@ 2019-03-14  8:19   ` Vladimir Davydov
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-14  8:19 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

On Wed, Mar 13, 2019 at 03:15:38PM +0300, Kirill Shcherbatov wrote:
> Implement functions for retrieving tuple hints for a particular
> key_def. Hint is an integer that can be used for tuple comparison
> optimization: if a hint of one tuple is less than a hint of another
> then the first tuple is definitely less than the second; only if
> hints are equal tuple_compare must be called for getting comparison
> result.
> 
> Hints are calculated using only the first part of key_def.
> 
> Hints are only useful when:
>  * they are precalculated and stored along with the tuple;
> calculation is not cheap (cheaper than tuple_compare btw) but
> precalculated hints allow to compare tuples without even fetching
> tuple data.
>  * first part of key_def is 'string' (for now)
>  * since hint is calculated using only the first part of key_def
> (and only first several characters if it is a string) this part
> must be different with high probability for every tuple pair.

Please refresh the comment. It isn't quite correct.
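
The contract described in the quoted message boils down to a few lines; a
sketch with illustrative names (not the actual Tarantool API):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the patch's sentinel for "no hint computed". */
#define HINT_INVALID UINT64_MAX

/*
 * Hint contract sketch: an order decided by hints is final;
 * equal or missing hints are undecided and the caller must
 * fall back to a full tuple_compare().
 */
static inline int
hint_cmp_sketch(uint64_t hint_a, uint64_t hint_b)
{
	if (hint_a == HINT_INVALID || hint_b == HINT_INVALID ||
	    hint_a == hint_b)
		return 0; /* undecided: run the full comparator */
	return hint_a < hint_b ? -1 : 1;
}
```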

> 
> Closes #3961
> ---
>  src/box/key_def.h        |  95 +++++++
>  src/box/memtx_tree.c     |  46 ++-
>  src/box/tuple_compare.cc | 584 +++++++++++++++++++++++++++++++++++++--
>  src/lib/coll/coll.c      |  33 +++
>  src/lib/coll/coll.h      |   4 +
>  5 files changed, 730 insertions(+), 32 deletions(-)

Please add tests checking that changing an indexed field type from
scalar to a basic type and vice versa without rebuilding the index
doesn't break ordering.

> 
> diff --git a/src/box/key_def.h b/src/box/key_def.h
> index dd62f6a35..0d8f3112e 100644
> --- a/src/box/key_def.h
> +++ b/src/box/key_def.h
> @@ -137,6 +137,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
>  typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
>  			       const struct tuple *tuple_b,
>  			       struct key_def *key_def);
> +/** @copydoc tuple_compare_with_key_hinted() */
> +typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
> +					       uint64_t tuple_hint,
> +					       const char *key,
> +					       uint32_t part_count,
> +					       uint64_t key_hint,
> +					       struct key_def *key_def);

On second thought, we might want to reduce the hint size or increase
it in the future. Would it be better to introduce a helper type for
hints so that we could easily do this if we need to?

  typedef uint64_t tuple_hint_t;

What do you think? Would it be worthwhile? Would it look OK?

Sorry, I asked you to use uint64_t before. Looks to me now that
I was wrong... :-(

> @@ -571,6 +597,51 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
>  	return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
>  }
>  
> +/**
> + * Compare tuples using the key definition and comparison hints.
> + * @param tuple_a First tuple.
> + * @param tuple_a_hint Comparison hint is calculated for the
> + *                     @a tuple_a.

Should be "Comparison hint of @a tuple_a"

Please fix here and everywhere else where appropriate.

> + * @param tuple_b Second tuple.
> + * @param tuple_b_hint Comparison hint is calculated for the
> + *                     @a tuple_b.
> + * @param key_def Key definition.
> + * @retval 0  if key_fields(tuple_a) == key_fields(tuple_b)
> + * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b)
> + * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b)
> + */
> +static inline int
> +tuple_compare_hinted(const struct tuple *tuple_a, uint64_t tuple_a_hint,
> +		     const struct tuple *tuple_b, uint64_t tuple_b_hint,
> +		     struct key_def *key_def)
> +{
> +	return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
> +					     tuple_b_hint, key_def);
> +}
> +
> +/**
> + * Compare tuple with key using the key definition and
> + * comparison hints.
> + * @param tuple tuple
> + * @param tuple_hint Comparison hint is calculated for @a tuple.
> + * @param key key parts without MessagePack array header
> + * @param part_count the number of parts in @a key

This is nitpicking, of course, but please be consistent. If you put dot
(.) at the end of a sentence, put it everywhere.

> + * @param key_hint t Comparison key kent is calculated for @a key.

Comparison hint of @a key.

> + * @param key_def key definition
> + * @retval 0  if key_fields(tuple) == parts(key)
> + * @retval <0 if key_fields(tuple) < parts(key)
> + * @retval >0 if key_fields(tuple) > parts(key)
> + */
> +static inline int
> +tuple_compare_with_key_hinted(const struct tuple *tuple, uint64_t tuple_hint,
> +			      const char *key, uint32_t part_count,
> +			      uint64_t key_hint, struct key_def *key_def)
> +{
> +	return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
> +						      part_count, key_hint,
> +						      key_def);
> +}
> +
>  /**
>   * Compute hash of a tuple field.
>   * @param ph1 - pointer to running hash
> @@ -624,6 +695,30 @@ key_hash(const char *key, struct key_def *key_def)
>  	return key_def->key_hash(key, key_def);
>  }
>  
> + /*
> + * Get a comparison hint for a @a tuple.

The article before 'tuple' ('a') is redundant.

> + * @param tuple - tuple to get uint64_t of.

tuple to compute the hint for.

> + * @param key_def - key_def that defines which comparison is used.

key definition used for tuple comparison.

> + * @return the comparison auxiliary information.

the hint.

> + */
> +static inline uint64_t
> +tuple_hint(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	return key_def->tuple_hint(tuple, key_def);
> +}
> +
> +/**
> + * Get a comparison hint of a @a key.
> + * @param key - key to get hint of.
> + * @param key_def - key_def that defines which comparison is used.
> + * @return the comparison auxiliary information.

Ditto.

> + */
> +static inline uint64_t
> +key_hint(const char *key, struct key_def *key_def)
> +{
> +	return key_def->key_hint(key, key_def);
> +}
> +
>  #if defined(__cplusplus)
>  } /* extern "C" */
>  #endif /* defined(__cplusplus) */
> diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
> index d5b42efb6..2c1933ceb 100644
> --- a/src/box/memtx_tree.c
> +++ b/src/box/memtx_tree.c
> @@ -47,6 +47,11 @@ struct memtx_tree_key_data {
>  	const char *key;
>  	/** Number of msgpacked search fields. */
>  	uint32_t part_count;
> +	/**
> +	 * Comparison hint is calculated with
> +	 * key_hint(key, ...).
> +	 */

/** Comparison hint, see tuple_hint(). */

> +	uint64_t hint;
>  };
>  
>  /**
> @@ -55,6 +60,11 @@ struct memtx_tree_key_data {
>  struct memtx_tree_data {
>  	/* Tuple that this node is represents. */
>  	struct tuple *tuple;
> +	/**
> +	 * Comparison hint is calculated with
> +	 * tuple_hint(tuple, ...).
> +	 */

/** Comparison hint, see key_hint(). */

> +	uint64_t hint;
>  };
>  
>  /**
> @@ -69,16 +79,18 @@ static bool
>  memtx_tree_data_identical(const struct memtx_tree_data *a,
>  			  const struct memtx_tree_data *b)
>  {
> -	return a->tuple == b->tuple;
> +	return a->tuple == b->tuple && a->hint == b->hint;
>  }
>  
>  #define BPS_TREE_NAME memtx_tree
>  #define BPS_TREE_BLOCK_SIZE (512)
>  #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
>  #define BPS_TREE_COMPARE(a, b, arg)\
> -	tuple_compare((&a)->tuple, (&b)->tuple, arg)
> +	tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
> +			     (&b)->hint, arg)

Hmm, (&a)->tuple is equivalent to (a).tuple, isn't it?

> @@ -516,9 +533,11 @@ memtx_tree_index_get(struct index *base, const char *key,
>  	assert(base->def->opts.is_unique &&
>  	       part_count == base->def->key_def->part_count);
>  	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
> +	struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
>  	struct memtx_tree_key_data key_data;
>  	key_data.key = key;
>  	key_data.part_count = part_count;
> +	key_data.hint = key_hint(key, cmp_def);
>  	struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
>  	*result = res != NULL ? res->tuple : NULL;
>  	return 0;
> @@ -530,9 +549,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
>  			 struct tuple **result)
>  {
>  	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
> +	struct key_def *key_def = index->tree.arg;

Accessing index->tree.arg looks ugly - it violates encapsulation.
Besides, right above you use index_def_cmp_def instead. I think we
better introduce a helper function for accessing index->tree.arg
and use it everywhere:

	memtx_tree_cmp_def(&index->tree)

> diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
> index 1bcff2ca3..86922c9bb 100644
> --- a/src/box/tuple_compare.cc
> +++ b/src/box/tuple_compare.cc
> @@ -34,6 +34,383 @@
>  #include "trivia/util.h" /* NOINLINE */
>  #include <math.h>
>  
> +/**
> + * Tuple hints

If you agree to add tuple_hint_t, then please move this comment to
the type declaration. Don't forget to update it when you introduce
multikey indexes.

> + *
> + * Hint is a value h(t) is calculated for tuple in terms
> + * of particular key_def that has the following rules:

Tuple comparison hint h(t) is such a function of tuple t that
the following conditions always hold for any pair of tuples
t1 and t2:

> + * if h(t1) < h(t2) then t1 < t2;
> + * if h(t1) > h(t2) then t1 > t2;
> + * if t1 == t2 then regular comparision is required;

if h(t1) == h(t2) then t1 may or may not be equal to t2.

> + * These rules mean that instead of direct tuple vs tuple
> + * (or tuple vs key) comparison one may compare theirs
> + * hints first; and only if theirs hints equal compare
> + * the tuples themselves.
> + *
> + * The comparison hint has the following structure:
> + * uint64_t: [ CMP_HINT_TYPE | CMP_HINT_VALUE ]
> + *           64              62               0 bit
> + *
> + * The reserved value HINT_INVALID is returned when hint is
> + * undefined.
> + */
> +#define HINT_MAX (((int64_t)1 << 61) - 1)
> +#define HINT_MIN (-HINT_MAX - 1)
> +#define HINT_UMAX (((uint64_t)1 << 62) - 1)

What's HINT_UMAX? Please remove if possible.

> +
> +#define HINT_INVALID UINT64_MAX

HINT_NONE would be a better name IMO, because it would stress the
fact that the hint is absent.

> +
> +#define HINT_TYPE_NUMBER ((uint8_t)1)
> +#define HINT_TYPE_STRING ((uint8_t)2)
> +#define HINT_TYPE_BOOL   ((uint8_t)3)
> +
> +#define HINT_TYPE(hint) ((uint64_t)hint >> 62)
> +#define HINT_VALUE(hint) ((uint64_t)hint & (((uint64_t)1 << 62) - 1))

Using 62 directly is error-prone (in case we decide to change it).
Please define constants for all numbers you use.

> +#define HINT(type, value) ({ \
> +	assert(HINT_TYPE(value) == 0); \
> +	(((uint64_t)type << 62) | (uint64_t)value);})

Would be better to make these helpers static inline functions.

> +
> +/**
> + * Perform hints comparison.
> + * Returns 1 when hint_a > hint_b, -1 when hint_b > hint_a;
> + * If the hints are incompatible or equal, a value of 0 is
> + * returned, and the further comparison of the original data is
> + * required.
> + */
> +static int
> +hint_cmp(uint64_t hint_a, uint64_t hint_b)
> +{
> +	if (hint_a != HINT_INVALID && hint_b != HINT_INVALID &&
> +	    HINT_TYPE(hint_a) == HINT_TYPE(hint_b) &&
> +	    HINT_VALUE(hint_a) != HINT_VALUE(hint_b))
> +		return hint_a < hint_b ? -1 : 1;

Hints must be defined in such a way that we don't need to compare types,
i.e. hint(123) < hint("abc").

I suggest you take a look at mp_class. Maybe we could reuse it for
hint types?

> +	else
> +		return 0;
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_uint(const char *key, struct key_def *key_def)

Please use unsigned/integer for these functions so as not to confuse
them with MP_INT/UINT.

> +{
> +	(void)key_def;
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
> +	uint64_t ret;
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_NUMBER, 0);
> +	assert(mp_typeof(*key) == MP_UINT);
> +	uint64_t val = mp_decode_uint(&key);
> +	ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
> +	return HINT(HINT_TYPE_NUMBER, ret);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_uint(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_UNSIGNED);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_NUMBER, 0);
> +	return key_hint_uint<false, is_nullable>(field, key_def);
> +}

Although I didn't complain about the way you define hint functions
before, I've always had an inkling that something is wrong, that calling
key_hint from tuple_hint looks weird. The 'is_optional' template
argument you added recently has only strengthened my suspicions: it
looks confusing, because in fact it doesn't have anything to do with
optional parts we use in tuple_compare templates.

I propose the following refactoring. Let's introduce field_hint_*
functions (not using templates) that would take only binary data
of a particular type and return a hint, e.g.

	tuple_hint_t
	field_hint_unsigned(const char *field) {
		assert(mp_typeof(*field) == MP_UINT);
		...
	}

Then add two parametrised functions for computing a hint for a key and
a tuple that would look like this:

	template<enum field_type type, ...>
	tuple_hint(const struct tuple *tuple, struct key_def *key_def) {
		switch (type) {
		case FIELD_TYPE_UNSIGNED:
			return field_hint_unsigned(...);
		case FIELD_TYPE_INTEGER:
			return field_hint_integer(...);
		...
		}
	}

	template<enum field_type type, ...>
	key_hint(...) {
		switch (type) {
		case FIELD_TYPE_UNSIGNED:
			return field_hint_unsigned(...);
		case FIELD_TYPE_INTEGER:
			return field_hint_integer(...);
		...
		}
	}

This way all the logic computing hints would be consolidated in
field_hint_* family of functions which would be simple and wouldn't
even take key_def. This would also allow you to get rid of that ugly
switch-case in key_def_set_hint_func.

> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_int(const char *key, struct key_def *key_def)
> +{
> +	(void)key_def;
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_INTEGER);
> +	uint64_t ret;
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_NUMBER, 0);
> +	if (mp_typeof(*key) == MP_UINT) {
> +		uint64_t val = mp_decode_uint(&key);
> +		ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
> +	} else {
> +		assert(mp_typeof(*key) == MP_INT);
> +		int64_t val = mp_decode_int(&key);
> +		ret = val > HINT_MAX ? HINT_UMAX :
> +		      val < HINT_MIN ? 0 :
> +		      (uint64_t)val - (uint64_t)HINT_MIN;
> +	}
> +	return HINT(HINT_TYPE_NUMBER, ret);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_int(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_INTEGER);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_NUMBER, 0);
> +	return key_hint_int<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_number(const char *key, struct key_def *key_def)
> +{
> +	(void)key_def;
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_NUMBER ||
> +	       key_def->parts->type == FIELD_TYPE_SCALAR);
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_NUMBER, 0);
> +	uint64_t ret;
> +	switch (mp_typeof(*key)) {
> +	case MP_FLOAT:
> +	case MP_DOUBLE: {
> +		double f = mp_typeof(*key) == MP_FLOAT ?
> +			   mp_decode_float(&key) : mp_decode_double(&key);
> +		if (!isfinite(f))
> +			return HINT_INVALID;
> +		double val;
> +		(void)modf(f, &val);
> +		ret = val > (double)HINT_MAX ? HINT_UMAX :
> +		      val < (double)HINT_MIN ? 0 :

As I told you before, comparing a double with an int64_t like this is
incorrect:

	#include <stdio.h>
	#include <stdint.h>

	int main()
	{
		int64_t x = (1LL << 61) - 1;
		double y = x;
		printf("x is %lld (int64_t)\n", x);
		printf("y is %lf (double)\n", y);
		printf("y <= (double)x is %s\n",
		       y <= (double)x ? "true" : "false");
		printf("(int64_t)y is %lld\n", (int64_t)y);
	}

prints

	x is 2305843009213693951 (int64_t)
	y is 2305843009213693952.000000 (double)
	y <= (double)x is true
	(int64_t)y is 2305843009213693952

Please fix and don't forget to add a test case.

> +		      (uint64_t)val - (uint64_t)HINT_MIN;
> +		break;
> +	}
> +	case MP_INT: {
> +		int64_t val = (uint64_t)mp_decode_int(&key);
> +		ret = val > HINT_MAX ? HINT_UMAX :
> +		      val < HINT_MIN ? 0 :
> +		      (uint64_t)val - (uint64_t)HINT_MIN;
> +		break;
> +	}
> +	case MP_UINT: {
> +		uint64_t val = mp_decode_uint(&key);
> +		ret = val > HINT_MAX ? HINT_UMAX : val - (uint64_t)HINT_MIN;
> +		break;
> +	}
> +	default:
> +		unreachable();
> +	}
> +	return HINT(HINT_TYPE_NUMBER, ret);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_number(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_NUMBER);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_NUMBER, 0);
> +	return key_hint_number<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_boolean(const char *key, struct key_def *key_def)
> +{
> +	(void)key_def;
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_BOOLEAN ||
> +	       key_def->parts->type == FIELD_TYPE_SCALAR);
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_BOOL, 0);
> +	return (uint64_t)mp_decode_bool(&key);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_boolean(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_BOOLEAN);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_BOOL, 0);
> +	return key_hint_boolean<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_string(const char *key, struct key_def *key_def)
> +{
> +	(void)key_def;
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->coll == NULL);
> +	assert(key_def->parts->type == FIELD_TYPE_STRING ||
> +	       key_def->parts->type == FIELD_TYPE_SCALAR);
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_STRING, 0);
> +	assert(mp_typeof(*key) == MP_STR);
> +	uint32_t len;
> +	const unsigned char *str =
> +		(const unsigned char *)mp_decode_str(&key, &len);
> +	uint64_t result = 0;
> +	uint32_t process_len = MIN(len, 7);
> +	for (uint32_t i = 0; i < process_len; i++) {
> +		result <<= 8;
> +		result |= str[i];
> +	}
> +	result <<= 8 * (7 - process_len);

Please define an appropriate constant for 7.

> +	return HINT(HINT_TYPE_STRING, result);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_string(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->coll == NULL);
> +	assert(key_def->parts->type == FIELD_TYPE_STRING);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_STRING, 0);
> +	return key_hint_string<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_string_coll(const char *key, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert((key_def->parts->type == FIELD_TYPE_STRING ||
> +	        key_def->parts->type == FIELD_TYPE_SCALAR) &&
> +	        key_def->parts->coll != NULL);
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_STRING, 0);
> +	assert(mp_typeof(*key) == MP_STR);
> +	uint32_t len;
> +	const char *str = mp_decode_str(&key, &len);
> +	return key_def->parts->coll->hint(str, len, key_def->parts->coll);
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_string_coll(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_STRING &&
> +	        key_def->parts->coll != NULL);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_STRING, 0);
> +	return key_hint_string_coll<false, is_nullable>(field, key_def);
> +}
> +
> +template<bool is_optional, bool is_nullable>
> +static uint64_t
> +key_hint_scalar(const char *key, struct key_def *key_def)
> +{
> +	(void)key_def;
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_SCALAR);
> +	if ((is_optional && key == NULL) ||
> +	    (is_nullable && mp_typeof(*key) == MP_NIL))
> +		return HINT(HINT_TYPE_BOOL, 0);
> +	switch (mp_typeof(*key)) {
> +	case MP_INT:
> +	case MP_UINT:
> +	case MP_FLOAT:
> +	case MP_DOUBLE:
> +		return key_hint_number<is_optional, is_nullable>(key, key_def);
> +	case MP_BOOL:
> +		return key_hint_boolean<is_optional, is_nullable>(key, key_def);
> +	case MP_STR:
> +		if (key_def->parts->coll == NULL)
> +			return key_hint_string<is_optional,
> +					is_nullable>(key, key_def);
> +		else
> +			return key_hint_string_coll<is_optional,
> +					is_nullable>(key, key_def);

Please make sure this if() is resolved at compile time. Since we've
chosen the path of using templates for optimization, we should stick
to it till the end.

> +	default:
> +		return HINT_INVALID;

Should be unreachable(). BTW you forgot MP_BIN.

> +	}
> +}
> +
> +template<bool is_nullable>
> +static uint64_t
> +tuple_hint_scalar(const struct tuple *tuple, struct key_def *key_def)
> +{
> +	assert(key_part_is_nullable(key_def->parts) == is_nullable);
> +	assert(key_def->parts->type == FIELD_TYPE_SCALAR);
> +	const char *field = tuple_field_by_part(tuple, key_def->parts);
> +	if (is_nullable && field == NULL)
> +		return HINT(HINT_TYPE_BOOL, 0);
> +	return key_hint_scalar<false, is_nullable>(field, key_def);
> +}
> +
> +void
> +key_def_set_hint_func(struct key_def *def)
> +{
> +	def->key_hint = NULL;
> +	def->tuple_hint = NULL;
> +	bool is_nullable = key_part_is_nullable(def->parts);
> +	switch (def->parts->type) {
> +	case FIELD_TYPE_UNSIGNED: {
> +		def->key_hint = is_nullable ? key_hint_uint<true, true> :
> +					      key_hint_uint<true, false>;
> +		def->tuple_hint = is_nullable ? tuple_hint_uint<true> :
> +						tuple_hint_uint<false>;
> +		break;
> +	}
> +	case FIELD_TYPE_INTEGER: {
> +		def->key_hint = is_nullable ? key_hint_int<true, true> :
> +					      key_hint_int<true, false>;
> +		def->tuple_hint = is_nullable ? tuple_hint_int<true> :
> +						tuple_hint_int<false>;
> +		break;
> +	}
> +	case FIELD_TYPE_STRING: {
> +		if (def->parts->coll != NULL) {
> +			def->key_hint =
> +				is_nullable ? key_hint_string_coll<true, true> :
> +					      key_hint_string_coll<true, false>;
> +			def->tuple_hint =
> +				is_nullable ? tuple_hint_string_coll<true> :
> +					      tuple_hint_string_coll<false>;
> +		} else {
> +			def->key_hint =
> +				is_nullable ? key_hint_string<true, true> :
> +					      key_hint_string<true, false>;
> +			def->tuple_hint = is_nullable ? tuple_hint_string<true> :
> +							tuple_hint_string<false>;
> +		}
> +		break;
> +	}
> +	case FIELD_TYPE_NUMBER: {
> +		def->key_hint = is_nullable ? key_hint_number<true, true> :
> +					      key_hint_number<true, false>;
> +		def->tuple_hint = is_nullable ? tuple_hint_number<true> :
> +						tuple_hint_number<false>;
> +		break;
> +	}
> +	case FIELD_TYPE_BOOLEAN: {
> +		def->key_hint = is_nullable ? key_hint_boolean<true, true> :
> +					      key_hint_boolean<true, false>;
> +		def->tuple_hint = is_nullable ? tuple_hint_boolean<true> :
> +						tuple_hint_boolean<false>;
> +		break;
> +	}
> +	case FIELD_TYPE_SCALAR: {
> +		def->key_hint = is_nullable ? key_hint_scalar<true, true> :
> +					      key_hint_scalar<true, false>;
> +		def->tuple_hint = is_nullable ? tuple_hint_scalar<true> :
> +					        tuple_hint_scalar<false>;
> +		break;
> +	}
> +	default: {
> +		break;
> +	}
> +	}
> +}
> +
>  /* {{{ tuple_compare */
>  
>  /*
> @@ -792,9 +1210,13 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
>  
>  template <bool is_nullable, bool has_optional_parts>
>  static int
> -tuple_compare_sequential(const struct tuple *tuple_a,
> -			 const struct tuple *tuple_b, key_def *key_def)
> +tuple_compare_sequential_hinted(const struct tuple *tuple_a,
> +		uint64_t tuple_a_hint, const struct tuple *tuple_b,
> +		uint64_t tuple_b_hint, key_def *key_def)
>  {
> +	int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
> +	if (rc != 0)
> +		return rc;

This should go after all the assertions below.

>  	assert(!has_optional_parts || is_nullable);
>  	assert(has_optional_parts == key_def->has_optional_parts);
>  	assert(key_def_is_sequential(key_def));
> diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
> index afc15e809..3bd7c44f5 100644
> --- a/src/lib/coll/coll.c
> +++ b/src/lib/coll/coll.c
> @@ -133,6 +133,37 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
>  	return s_len;
>  }
>  
> +/** Get a compare hint of a binary collation. */
> +static uint64_t
> +coll_bin_hint(const char *s, size_t s_len, struct coll *coll)
> +{
> +	(void)coll;
> +	uint64_t result = 0;
> +	uint32_t process_len = MIN(s_len, 7);
> +	for (uint32_t i = 0; i < process_len; i++) {
> +		result <<= 8;
> +		result |= ((unsigned char *)s)[i];
> +	}
> +	result <<= 8 * (7 - process_len);

I don't like that you copy-n-paste string hint computation logic here
and in tuple_compare.cc. I think that it should be consolidated in
tuple_compare.cc while coll->hint should return a hint in some raw
format. I guess it could be just a wrapper around ucol_nextSortKeyPart.
Maybe we should even pick another name for it.

> +	return result;
> +}
> +
> +/** Get a compare hint of a string using ICU collation. */
> +static uint64_t
> +coll_icu_hint(const char *s, size_t s_len, struct coll *coll)
> +{
> +	assert(coll->type == COLL_TYPE_ICU);
> +	UCharIterator itr;
> +	uiter_setUTF8(&itr, s, s_len);
> +	uint8_t buf[7];
> +	uint32_t state[2] = {0, 0};
> +	UErrorCode status = U_ZERO_ERROR;
> +	int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state, buf,
> +					   sizeof(buf), &status);
> +	assert(got >= 0 && got <= 7);
> +	return coll_bin_hint((const char *)buf, got, NULL);
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [tarantool-patches] Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
  2019-03-14  8:19   ` Vladimir Davydov
@ 2019-03-15 10:20     ` Kirill Shcherbatov
  2019-03-20 18:08       ` Vladimir Davydov
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:20 UTC (permalink / raw)
  To: tarantool-patches, Vladimir Davydov

> Please refresh the comment. It isn't quite correct.
I've stripped it a bit.

> On the second thought, we might want to reduce the hint size or increase
> it in future. Would it be better to introduce a helper type for hints so
> that we could easily do this if we need to?
> 
>   typedef uint64_t tuple_hint_t;
> 
> What do you think? Would it be worthwhile? Would it look OK?
> 
> Sorry, I asked you to use uint64_t before. Looks to me now that
> I was wrong... :-(
The tuple_hint_t name is not really good, I believe.
First, tuple_hint() is a routine that calculates such a hint for a tuple.
I prefer a hint_t type, which would still be fine later with the multikey index introduction.

> Hmm, (&a)->tuple is equivalent to (a).tuple, isn't it?
I don't understand why, but it doesn't work that way.


> Accessing index->tree.arg looks ugly - it violates encapsulation.
> Besides, right above you use index_def_cmp_def instead. I think we
> better introduce a helper function for accessing index->tree.arg
> and use it everywhere:
> 
> 	memtx_tree_cmp_def(&index->tree)
Now I use the memtx_tree_index_cmp_def() routine, which is already present
in the code and works fine in this scenario.

==============================================================

Implement functions for retrieving tuple hints for a particular
key_def. Hint is an integer that can be used for tuple comparison
optimization: if a hint of one tuple is less than a hint of another
then the first tuple is definitely less than the second; only if
hints are equal tuple_compare must be called for getting comparison
result. Hints are calculated using only the first part of key_def.

Closes #3961
---
 src/box/key_def.h        | 128 +++++++++++++
 src/box/memtx_tree.c     |  40 +++-
 src/box/tuple_compare.cc | 394 +++++++++++++++++++++++++++++++++++++--
 src/lib/coll/coll.c      |  29 +++
 src/lib/coll/coll.h      |   5 +
 test/engine/ddl.result   | 180 ++++++++++++++++++
 test/engine/ddl.test.lua |  49 +++++
 7 files changed, 800 insertions(+), 25 deletions(-)

diff --git a/src/box/key_def.h b/src/box/key_def.h
index 7c9f8c6b5..174790f72 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -116,6 +116,41 @@ struct key_part {
 struct key_def;
 struct tuple;
 
+/**
+ * Tuple comparison hint h(t) is such a function of tuple t that
+ * the following conditions always hold for any pair of tuples
+ * t1 and t2:
+ * if h(t1) < h(t2) then t1 < t2;
+ * if h(t1) > h(t2) then t1 > t2;
+ * if h(t1) == h(t2) then t1 may or may not be equal to t2.
+ * These rules mean that instead of direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their
+ * hints first; and only if their hints are equal compare
+ * the tuples themselves.
+ *
+ * The comparison hint has the following structure:
+ * uint64_t: [ HINT_TYPE | HINT_VALUE ]
+ *           64          60           0 bit
+ *
+ * To make the values of hints for numeric types (unsigned,
+ * integer, float, double) compatible, define a linear mapping
+ * of the segment f(x) = k*x + b = x + max + 1:
+ * f([-max - 1, max]) = [0, 2 * max + 1].
+ * It is easy to see that a < b <=> f(a) < f(b).
+ * The max value is determined by the number of bits allocated
+ * for storage of HINT_VALUE.
+ */
+typedef uint64_t hint_t;
+
+#define HINT_VALUE_BITS 60
+#define HINT_MAX (((int64_t)1 << (HINT_VALUE_BITS - 1)) - 1)
+#define HINT_MIN (-HINT_MAX - 1)
+
+/**
+ * The reserved value to use when comparison hint is undefined.
+ */
+#define HINT_NONE ((hint_t)UINT64_MAX)
+
 /**
  * Get is_nullable property of key_part.
  * @param key_part for which attribute is being fetched
@@ -137,6 +172,19 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
 typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
 			       const struct tuple *tuple_b,
 			       struct key_def *key_def);
+/** @copydoc tuple_compare_with_key_hinted() */
+typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
+					       hint_t tuple_hint,
+					       const char *key,
+					       uint32_t part_count,
+					       hint_t key_hint,
+					       struct key_def *key_def);
+/** @copydoc tuple_compare_hinted() */
+typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a,
+				      hint_t tuple_a_hint,
+				      const struct tuple *tuple_b,
+				      hint_t tuple_b_hint,
+				      struct key_def *key_def);
 /** @copydoc tuple_extract_key() */
 typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
 				     struct key_def *key_def,
@@ -152,6 +200,11 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
 /** @copydoc key_hash() */
 typedef uint32_t (*key_hash_t)(const char *key,
 				struct key_def *key_def);
+/** @copydoc tuple_hint() */
+typedef hint_t (*tuple_hint_t)(const struct tuple *tuple,
+			       struct key_def *key_def);
+/** @copydoc key_hint() */
+typedef hint_t (*key_hint_t)(const char *key, struct key_def *key_def);
 
 /* Definition of a multipart key. */
 struct key_def {
@@ -159,6 +212,10 @@ struct key_def {
 	tuple_compare_t tuple_compare;
 	/** @see tuple_compare_with_key() */
 	tuple_compare_with_key_t tuple_compare_with_key;
+	/** @see tuple_compare_with_key_hinted() */
+	tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
+	/** @see tuple_compare_hinted() */
+	tuple_compare_hinted_t tuple_compare_hinted;
 	/** @see tuple_extract_key() */
 	tuple_extract_key_t tuple_extract_key;
 	/** @see tuple_extract_key_raw() */
@@ -167,6 +224,10 @@ struct key_def {
 	tuple_hash_t tuple_hash;
 	/** @see key_hash() */
 	key_hash_t key_hash;
+	/** @see tuple_hint() */
+	tuple_hint_t tuple_hint;
+	/** @see key_hint() */
+	key_hint_t key_hint;
 	/**
 	 * Minimal part count which always is unique. For example,
 	 * if a secondary index is unique, then
@@ -575,6 +636,49 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
 	return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
 }
 
+/**
+ * Compare tuples using the key definition and comparison hints.
+ * @param tuple_a First tuple.
+ * @param tuple_a_hint Comparison hint of @a tuple_a.
+ * @param tuple_b Second tuple.
+ * @param tuple_b_hint Comparison hint of @a tuple_b.
+ * @param key_def Key definition.
+ * @retval 0  if key_fields(tuple_a) == key_fields(tuple_b).
+ * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b).
+ * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b).
+ */
+static inline int
+tuple_compare_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+		     const struct tuple *tuple_b, hint_t tuple_b_hint,
+		     struct key_def *key_def)
+{
+	return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
+					     tuple_b_hint, key_def);
+}
+
+/**
+ * Compare tuple with key using the key definition and
+ * comparison hints.
+ * @param tuple Tuple.
+ * @param tuple_hint Comparison hint of @a tuple.
+ * @param key Key parts without MessagePack array header.
+ * @param part_count The number of parts in @a key.
+ * @param key_hint Comparison hint of @a key.
+ * @param key_def Key definition.
+ * @retval 0  if key_fields(tuple) == parts(key).
+ * @retval <0 if key_fields(tuple) < parts(key).
+ * @retval >0 if key_fields(tuple) > parts(key).
+ */
+static inline int
+tuple_compare_with_key_hinted(const struct tuple *tuple, hint_t tuple_hint,
+			      const char *key, uint32_t part_count,
+			      hint_t key_hint, struct key_def *key_def)
+{
+	return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
+						      part_count, key_hint,
+						      key_def);
+}
+
 /**
  * Compute hash of a tuple field.
  * @param ph1 - pointer to running hash
@@ -628,6 +732,30 @@ key_hash(const char *key, struct key_def *key_def)
 	return key_def->key_hash(key, key_def);
 }
 
+/**
+ * Get a comparison hint for @a tuple.
+ * @param tuple Tuple to compute the hint for.
+ * @param key_def Key definition used for tuple comparison.
+ * @return The hint.
+ */
+static inline hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+	return key_def->tuple_hint(tuple, key_def);
+}
+
+/**
+ * Get a comparison hint for @a key.
+ * @param key Key to compute the hint for.
+ * @param key_def Key definition used for tuple comparison.
+ * @return The hint.
+ */
+static inline hint_t
+key_hint(const char *key, struct key_def *key_def)
+{
+	return key_def->key_hint(key, key_def);
+}
+
 #if defined(__cplusplus)
 } /* extern "C" */
 #endif /* defined(__cplusplus) */
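[Editor's note: the hinted wrappers above pay off because hint_cmp() can order most pairs without touching tuple data. A minimal standalone model of that contract — hint_t and HINT_NONE mirror the patch, but the sentinel value here is purely illustrative:]

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hint_t;
/* Illustrative sentinel; the real HINT_NONE lives in key_def.h. */
#define HINT_NONE ((hint_t)UINT64_MAX)

/*
 * Mirrors hint_cmp(): hints decide the order only when both are
 * known and different; otherwise the full comparator must break
 * the tie, so we report "equal" here.
 */
static int
hint_cmp_model(hint_t a, hint_t b)
{
	if (a != HINT_NONE && b != HINT_NONE && a != b)
		return a < b ? -1 : 1;
	return 0;
}
```

A hinted comparator simply returns this result when it is non-zero and falls back to tuple_compare() otherwise, which is exactly the shape of every `*_hinted` wrapper in this hunk.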
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index d5b42efb6..c2c8a19e7 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -47,6 +47,8 @@ struct memtx_tree_key_data {
 	const char *key;
 	/** Number of msgpacked search fields. */
 	uint32_t part_count;
+	/** Comparison hint, see key_hint(). */
+	hint_t hint;
 };
 
 /**
@@ -55,6 +57,8 @@ struct memtx_tree_key_data {
 struct memtx_tree_data {
 	/* Tuple that this node represents. */
 	struct tuple *tuple;
+	/** Comparison hint, see tuple_hint(). */
+	hint_t hint;
 };
 
 /**
@@ -69,16 +73,18 @@ static bool
 memtx_tree_data_identical(const struct memtx_tree_data *a,
 			  const struct memtx_tree_data *b)
 {
-	return a->tuple == b->tuple;
+	return a->tuple == b->tuple && a->hint == b->hint;
 }
 
 #define BPS_TREE_NAME memtx_tree
 #define BPS_TREE_BLOCK_SIZE (512)
 #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
 #define BPS_TREE_COMPARE(a, b, arg)\
-	tuple_compare((&a)->tuple, (&b)->tuple, arg)
+	tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
+			     (&b)->hint, arg)
 #define BPS_TREE_COMPARE_KEY(a, b, arg)\
-	tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg)
+	tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\
+				      (b)->part_count, (b)->hint, arg)
 #define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b)
 #define bps_tree_elem_t struct memtx_tree_data
 #define bps_tree_key_t struct memtx_tree_key_data *
@@ -113,7 +119,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c)
 	const struct memtx_tree_data *data_a = a;
 	const struct memtx_tree_data *data_b = b;
 	struct key_def *key_def = c;
-	return tuple_compare(data_a->tuple, data_b->tuple, key_def);
+	return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple,
+				    data_b->hint, key_def);
 }
 
 /* {{{ MemtxTree Iterators ****************************************/
@@ -235,9 +242,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret)
 		memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
 	/* Use user key def to save a few loops. */
 	if (res == NULL ||
-	    tuple_compare_with_key(res->tuple, it->key_data.key,
-				   it->key_data.part_count,
-				   it->index_def->key_def) != 0) {
+	    tuple_compare_with_key_hinted(res->tuple, res->hint,
+					  it->key_data.key,
+					  it->key_data.part_count,
+					  it->key_data.hint,
+					  it->index_def->key_def) != 0) {
 		iterator->next = tree_iterator_dummie;
 		it->current.tuple = NULL;
 		*ret = NULL;
@@ -266,9 +275,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
 		memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
 	/* Use user key def to save a few loops. */
 	if (res == NULL ||
-	    tuple_compare_with_key(res->tuple, it->key_data.key,
-				   it->key_data.part_count,
-				   it->index_def->key_def) != 0) {
+	    tuple_compare_with_key_hinted(res->tuple, res->hint,
+					  it->key_data.key,
+					  it->key_data.part_count,
+					  it->key_data.hint,
+					  it->index_def->key_def) != 0) {
 		iterator->next = tree_iterator_dummie;
 		it->current.tuple = NULL;
 		*ret = NULL;
@@ -516,9 +527,11 @@ memtx_tree_index_get(struct index *base, const char *key,
 	assert(base->def->opts.is_unique &&
 	       part_count == base->def->key_def->part_count);
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
 	struct memtx_tree_key_data key_data;
 	key_data.key = key;
 	key_data.part_count = part_count;
+	key_data.hint = key_hint(key, cmp_def);
 	struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
 	*result = res != NULL ? res->tuple : NULL;
 	return 0;
@@ -530,9 +543,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 			 struct tuple **result)
 {
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *key_def = memtx_tree_index_cmp_def(index);
 	if (new_tuple) {
 		struct memtx_tree_data new_data;
 		new_data.tuple = new_tuple;
+		new_data.hint = tuple_hint(new_tuple, key_def);
 		struct memtx_tree_data dup_data;
 		dup_data.tuple = NULL;
 
@@ -565,6 +580,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 	if (old_tuple) {
 		struct memtx_tree_data old_data;
 		old_data.tuple = old_tuple;
+		old_data.hint = tuple_hint(old_tuple, key_def);
 		memtx_tree_delete(&index->tree, old_data);
 	}
 	*result = old_tuple;
@@ -607,6 +623,8 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
 	it->type = type;
 	it->key_data.key = key;
 	it->key_data.part_count = part_count;
+	struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
+	it->key_data.hint = key_hint(key, cmp_def);
 	it->index_def = base->def;
 	it->tree = &index->tree;
 	it->tree_iterator = memtx_tree_invalid_iterator();
@@ -671,6 +689,8 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
 	struct memtx_tree_data *elem =
 		&index->build_array[index->build_array_size++];
 	elem->tuple = tuple;
+	struct key_def *key_def = memtx_tree_index_cmp_def(index);
+	elem->hint = tuple_hint(tuple, key_def);
 	return 0;
 }
 
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 7fe1766a8..54a977926 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -34,6 +34,14 @@
 #include "trivia/util.h" /* NOINLINE */
 #include <math.h>
 
+/**
+ * Compare hints according to the hint_t definition.
+ * Return 1 if hint_a > hint_b, -1 if hint_a < hint_b;
+ * return 0 if the hints are equal or incompatible.
+ */
+static int
+hint_cmp(hint_t hint_a, hint_t hint_b);
+
 /* {{{ tuple_compare */
 
 /*
@@ -427,13 +435,17 @@ tuple_compare_field_with_hint(const char *field_a, enum mp_type a_type,
 
 template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
-tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
-		       struct key_def *key_def)
+tuple_compare_slowpath_hinted(const struct tuple *tuple_a,
+		hint_t tuple_a_hint, const struct tuple *tuple_b,
+		hint_t tuple_b_hint, struct key_def *key_def)
 {
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	int rc = 0;
+	if ((rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0)
+		return rc;
 	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
 	const char *tuple_b_raw = tuple_data(tuple_b);
@@ -469,7 +481,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	struct key_part *end;
 	const char *field_a, *field_b;
 	enum mp_type a_type, b_type;
-	int rc;
 	if (is_nullable)
 		end = part + key_def->unique_part_count;
 	else
@@ -559,8 +570,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 
 template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
-tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
-				uint32_t part_count, struct key_def *key_def)
+tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
+		       struct key_def *key_def)
+{
+	return tuple_compare_slowpath_hinted<is_nullable, has_optional_parts,
+			has_json_paths>(tuple_a, HINT_NONE, tuple_b, HINT_NONE,
+					key_def);
+}
+
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
+		hint_t tuple_hint, const char *key, uint32_t part_count,
+		hint_t key_hint, struct key_def *key_def)
 {
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
@@ -568,6 +590,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key != NULL || part_count == 0);
 	assert(part_count <= key_def->part_count);
+	int rc = 0;
+	if ((rc = hint_cmp(tuple_hint, key_hint)) != 0)
+		return rc;
 	struct key_part *part = key_def->parts;
 	struct tuple_format *format = tuple_format(tuple);
 	const char *tuple_raw = tuple_data(tuple);
@@ -603,7 +628,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	}
 
 	struct key_part *end = part + part_count;
-	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
 		const char *field;
 		if (has_json_paths) {
@@ -642,6 +666,17 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	return 0;
 }
 
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
+				uint32_t part_count, struct key_def *key_def)
+{
+	return tuple_compare_with_key_slowpath_hinted<is_nullable,
+			has_optional_parts, has_json_paths>(tuple,
+					HINT_NONE, key, part_count,
+					HINT_NONE, key_def);
+}
+
 template<bool is_nullable>
 static inline int
 key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
@@ -700,13 +735,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
 
 template<bool is_nullable, bool has_optional_parts>
 static inline int
-tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
-				  uint32_t part_count, struct key_def *key_def)
+tuple_compare_with_key_sequential_hinted(const struct tuple *tuple,
+		hint_t tuple_hint, const char *key, uint32_t part_count,
+		hint_t key_hint, struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(key_def_is_sequential(key_def));
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	int rc = hint_cmp(tuple_hint, key_hint);
+	if (rc != 0)
+		return rc;
 	const char *tuple_key = tuple_data(tuple);
 	uint32_t field_count = mp_decode_array(&tuple_key);
 	uint32_t cmp_part_count;
@@ -716,8 +755,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 		assert(field_count >= part_count);
 		cmp_part_count = part_count;
 	}
-	int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
-						key_def);
+	rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
+					    key_def);
 	if (!has_optional_parts || rc != 0)
 		return rc;
 	/*
@@ -739,6 +778,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 	return 0;
 }
 
+template<bool is_nullable, bool has_optional_parts>
+static inline int
+tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
+				  uint32_t part_count, struct key_def *key_def)
+{
+	return tuple_compare_with_key_sequential_hinted<is_nullable,
+			has_optional_parts>(tuple, HINT_NONE, key,
+					    part_count, HINT_NONE, key_def);
+}
+
 int
 key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 {
@@ -759,13 +808,17 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 
 template <bool is_nullable, bool has_optional_parts>
 static int
-tuple_compare_sequential(const struct tuple *tuple_a,
-			 const struct tuple *tuple_b, key_def *key_def)
+tuple_compare_sequential_hinted(const struct tuple *tuple_a,
+		hint_t tuple_a_hint, const struct tuple *tuple_b,
+		hint_t tuple_b_hint, key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key_def_is_sequential(key_def));
 	assert(is_nullable == key_def->is_nullable);
+	int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+	if (rc != 0)
+		return rc;
 	const char *key_a = tuple_data(tuple_a);
 	uint32_t fc_a = mp_decode_array(&key_a);
 	const char *key_b = tuple_data(tuple_b);
@@ -779,7 +832,6 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 	bool was_null_met = false;
 	struct key_part *part = key_def->parts;
 	struct key_part *end = part + key_def->unique_part_count;
-	int rc;
 	uint32_t i = 0;
 	for (; part < end; ++part, ++i) {
 		enum mp_type a_type, b_type;
@@ -826,6 +878,16 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 	return 0;
 }
 
+template <bool is_nullable, bool has_optional_parts>
+static int
+tuple_compare_sequential(const struct tuple *tuple_a,
+			 const struct tuple *tuple_b, key_def *key_def)
+{
+	return tuple_compare_sequential_hinted<is_nullable,
+			has_optional_parts>(tuple_a, HINT_NONE, tuple_b,
+					    HINT_NONE, key_def);
+}
+
 template <int TYPE>
 static inline int
 field_compare(const char **field_a, const char **field_b);
@@ -956,6 +1018,18 @@ struct TupleCompare
 			compare(tuple_a, tuple_b, format_a,
 				format_b, field_a, field_b);
 	}
+
+	static int compare_hinted(const struct tuple *tuple_a,
+				  hint_t tuple_a_hint,
+				  const struct tuple *tuple_b,
+				  hint_t tuple_b_hint,
+				  struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple_a, tuple_b, key_def);
+	}
 };
 
 template <int TYPE, int ...MORE_TYPES>
@@ -973,15 +1047,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> {
 		return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b,
 					format_a, format_b, field_a, field_b);
 	}
+
+	static int compare_hinted(const struct tuple *tuple_a,
+				  hint_t tuple_a_hint,
+				  const struct tuple *tuple_b,
+				  hint_t tuple_b_hint,
+				  struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple_a, tuple_b, key_def);
+	}
 };
 } /* end of anonymous namespace */
 
 struct comparator_signature {
 	tuple_compare_t f;
+	tuple_compare_hinted_t f_hinted;
 	uint32_t p[64];
 };
 #define COMPARATOR(...) \
-	{ TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } },
+	{ TupleCompare<__VA_ARGS__>::compare, \
+	  TupleCompare<__VA_ARGS__>::compare_hinted, \
+	  { __VA_ARGS__, UINT32_MAX } },
 
 /**
  * field1 no, field1 type, field2 no, field2 type, ...
@@ -1133,6 +1222,17 @@ struct TupleCompareWithKey
 				compare(tuple, key, part_count,
 					key_def, format, field);
 	}
+
+	static int
+	compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+		       const char *key, uint32_t part_count, hint_t key_hint,
+		       struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_hint, key_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple, key, part_count, key_def);
+	}
 };
 
 template <int TYPE, int ...MORE_TYPES>
@@ -1153,6 +1253,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 			compare(tuple, key, part_count,
 				key_def, format, field);
 	}
+
+	static int
+	compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+		       const char *key, uint32_t part_count, hint_t key_hint,
+		       struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_hint, key_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple, key, part_count, key_def);
+	}
 };
 
 } /* end of anonymous namespace */
@@ -1160,11 +1271,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 struct comparator_with_key_signature
 {
 	tuple_compare_with_key_t f;
+	tuple_compare_with_key_hinted_t f_hinted;
 	uint32_t p[64];
 };
 
 #define KEY_COMPARATOR(...) \
-	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
+	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, \
+	  TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \
+	  { __VA_ARGS__ } },
 
 static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
 	KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
@@ -1187,16 +1301,22 @@ static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
 #define COMPARE_SLOWPATH(...) \
 	{tuple_compare_slowpath<__VA_ARGS__>, \
 	 tuple_compare_with_key_slowpath<__VA_ARGS__>, \
+	 tuple_compare_slowpath_hinted<__VA_ARGS__>, \
+	 tuple_compare_with_key_slowpath_hinted<__VA_ARGS__>, \
 	 __VA_ARGS__, false}
 
 #define COMPARE_SEQUENTIAL(...) \
 	{tuple_compare_sequential<__VA_ARGS__>, \
 	 tuple_compare_with_key_sequential<__VA_ARGS__>,  \
+	 tuple_compare_sequential_hinted<__VA_ARGS__>, \
+	 tuple_compare_with_key_sequential_hinted<__VA_ARGS__>, \
 	 __VA_ARGS__, false, true}
 
 static struct  {
 	tuple_compare_t tuple_compare;
 	tuple_compare_with_key_t tuple_compare_with_key;
+	tuple_compare_hinted_t tuple_compare_hinted;
+	tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
 	bool is_nullable;
 	bool has_optional_parts;
 	bool has_json_paths;
@@ -1221,11 +1341,236 @@ static struct  {
 
 /* }}} tuple_compare_with_key */
 
+static inline hint_t
+tuple_hint_new(enum mp_class type, uint64_t val)
+{
+	assert(((uint64_t)type >> (sizeof(hint_t) * CHAR_BIT - HINT_VALUE_BITS)) == 0);
+	assert(val >> HINT_VALUE_BITS == 0);
+	return (hint_t)(((uint64_t)type << HINT_VALUE_BITS) | val);
+}
+
+static int
+hint_cmp(hint_t hint_a, hint_t hint_b)
+{
+	if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b)
+		return hint_a < hint_b ? -1 : 1;
+	else
+		return 0;
+}
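[Editor's note: tuple_hint_new() packs the msgpack class into the topmost bits, so hints of different classes order by class first and by value second. A sketch of that packing under an assumed 3-bit class / 61-bit value split — the patch derives the split from hint_t and HINT_VALUE_BITS:]

```c
#include <assert.h>
#include <stdint.h>

#define VALUE_BITS 61 /* assumed split; see HINT_VALUE_BITS in the patch */

/*
 * Pack (class, value) into one 64-bit hint: under plain unsigned
 * comparison the class dominates, then the value decides.
 */
static uint64_t
hint_new_model(uint64_t mp_class, uint64_t val)
{
	assert(mp_class >> (64 - VALUE_BITS) == 0);
	assert(val >> VALUE_BITS == 0);
	return (mp_class << VALUE_BITS) | val;
}
```

This is what lets SCALAR hints stay comparable across types: nil, boolean, number, string and binary classes get increasing class codes, matching scalar ordering.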
+
+static hint_t
+field_hint_unsigned(const char *field)
+{
+	assert(mp_typeof(*field) == MP_UINT);
+	uint64_t val = mp_decode_uint(&field);
+	if (val > HINT_MAX)
+		val = HINT_MAX;
+	uint64_t ret = val - (uint64_t)HINT_MIN;
+	return tuple_hint_new(MP_CLASS_NUMBER, ret);
+}
+
+static hint_t
+field_hint_integer(const char *field)
+{
+	switch (mp_typeof(*field)) {
+	case MP_INT: {
+		int64_t val = mp_decode_int(&field);
+		if (val > HINT_MAX)
+			val = HINT_MAX;
+		else if (val < HINT_MIN)
+			val = HINT_MIN;
+		uint64_t ret = (uint64_t)val - (uint64_t)HINT_MIN;
+		return tuple_hint_new(MP_CLASS_NUMBER, ret);
+	}
+	case MP_UINT:
+		return field_hint_unsigned(field);
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
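[Editor's note: field_hint_integer() maps the signed range onto an unsigned one by subtracting HINT_MIN, so unsigned comparison of hints preserves signed order. A standalone sketch with an assumed 61-bit value width; the constant names are modeled on the patch:]

```c
#include <assert.h>
#include <stdint.h>

#define BITS 61 /* assumed hint value width */
#define H_MAX ((int64_t)((1ULL << (BITS - 1)) - 1))
#define H_MIN (-H_MAX - 1)

/* Clamp to the representable window, then bias so H_MIN maps to 0. */
static uint64_t
int_hint_model(int64_t v)
{
	if (v > H_MAX)
		v = H_MAX;
	else if (v < H_MIN)
		v = H_MIN;
	return (uint64_t)v - (uint64_t)H_MIN;
}
```

Values outside the window collapse onto the boundary hints, which is safe: equal hints just force the fallback to the full comparison.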
+
+static hint_t
+field_hint_number(const char *field)
+{
+	enum mp_type type = mp_typeof(*field);
+	switch (type) {
+	case MP_FLOAT:
+	case MP_DOUBLE: {
+		double f = type == MP_FLOAT ?
+			   mp_decode_float(&field) : mp_decode_double(&field);
+		if (!isfinite(f))
+			return HINT_NONE;
+		uint64_t ret;
+		if (f > exp2(HINT_VALUE_BITS - 1))
+			ret = (uint64_t)HINT_MAX - (uint64_t)HINT_MIN;
+		else if (f < -exp2(HINT_VALUE_BITS - 1))
+			ret = 0;
+		else {
+			double val;
+			(void)modf(f, &val);
+			ret = (uint64_t)val - (uint64_t)HINT_MIN;
+		}
+		return tuple_hint_new(MP_CLASS_NUMBER, ret);
+	}
+	case MP_UINT:
+		return field_hint_unsigned(field);
+	case MP_INT:
+		return field_hint_integer(field);
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+static hint_t
+field_hint_boolean(const char *field)
+{
+	assert(mp_typeof(*field) == MP_BOOL);
+	return tuple_hint_new(MP_CLASS_BOOL, mp_decode_bool(&field) ? 1 : 0);
+}
+
+static uint64_t
+memory_buffer_hint_val(const unsigned char *raw, uint32_t size)
+{
+	int process_len = MIN((int)size, 7);
+	uint64_t result = 0;
+	for (int i = 0; i < process_len; i++) {
+		result <<= CHAR_BIT;
+		result |= raw[i];
+	}
+	result <<= CHAR_BIT * (7 - process_len);
+	return result;
+}
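[Editor's note: memory_buffer_hint_val() above packs the first bytes of a string or blob big-endian into the hint value, so that unsigned comparison of hints agrees with memcmp() on those prefixes. A self-contained model of the same packing:]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack up to the first 7 bytes big-endian; shorter inputs are
 * left-aligned and zero-padded so prefixes sort before longer
 * strings sharing them.
 */
static uint64_t
prefix_hint_model(const unsigned char *raw, uint32_t size)
{
	int n = size < 7 ? (int)size : 7;
	uint64_t r = 0;
	for (int i = 0; i < n; i++)
		r = (r << 8) | raw[i];
	r <<= 8 * (7 - n);
	return r;
}
```

Strings sharing a 7-byte prefix produce equal hints, which is fine: equal hints only mean the full comparator must be consulted.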
+
+template <bool has_collation>
+static hint_t
+field_hint_string(const char *field, struct coll *coll)
+{
+	assert(has_collation == (coll != NULL));
+	assert(mp_typeof(*field) == MP_STR);
+	char buff[7];
+	uint32_t len;
+	const char *str = mp_decode_str(&field, &len);
+	if (has_collation) {
+		len = coll->get_sort_key(str, len, buff, sizeof(buff), coll);
+		str = buff;
+	}
+	uint64_t ret = memory_buffer_hint_val((const unsigned char *)str, len);
+	return tuple_hint_new(MP_CLASS_STR, ret);
+}
+
+template <bool has_collation>
+static hint_t
+field_hint_scalar(const char *field, struct coll *coll)
+{
+	switch (mp_classof(mp_typeof(*field))) {
+	case MP_CLASS_NUMBER:
+		return field_hint_number(field);
+	case MP_CLASS_BOOL:
+		return field_hint_boolean(field);
+	case MP_CLASS_STR:
+		return field_hint_string<has_collation>(field, coll);
+	case MP_CLASS_BIN: {
+		assert(coll == NULL);
+		uint32_t size;
+		const unsigned char *raw =
+			(const unsigned char *)mp_decode_bin(&field, &size);
+		uint64_t ret = memory_buffer_hint_val(raw, size);
+		return tuple_hint_new(MP_CLASS_BIN, ret);
+	}
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable, bool has_collation>
+static hint_t
+field_hint(const char *field, struct key_def *key_def)
+{
+	assert(type == key_def->parts->type);
+	assert(key_part_is_nullable(key_def->parts) == is_nullable);
+	assert(has_collation == (key_def->parts->coll != NULL));
+	switch (type) {
+	case FIELD_TYPE_UNSIGNED:
+		return field_hint_unsigned(field);
+	case FIELD_TYPE_INTEGER:
+		return field_hint_integer(field);
+	case FIELD_TYPE_NUMBER:
+		return field_hint_number(field);
+	case FIELD_TYPE_BOOLEAN:
+		return field_hint_boolean(field);
+	case FIELD_TYPE_STRING:
+		return field_hint_string<has_collation>(field,
+							key_def->parts->coll);
+	case FIELD_TYPE_SCALAR:
+		return field_hint_scalar<has_collation>(field,
+							key_def->parts->coll);
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable, bool has_collation>
+static hint_t
+key_hint(const char *key, struct key_def *key_def)
+{
+	if (key == NULL || (is_nullable && mp_typeof(*key) == MP_NIL))
+		return tuple_hint_new(MP_CLASS_NIL, 0);
+	return field_hint<type, is_nullable, has_collation>(key, key_def);
+}
+
+template <enum field_type type, bool is_nullable, bool has_collation>
+static hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && (field == NULL || mp_typeof(*field) == MP_NIL))
+		return tuple_hint_new(MP_CLASS_NIL, 0);
+	return field_hint<type, is_nullable, has_collation>(field, key_def);
+}
+
+#define HINT(...) { \
+	tuple_hint<__VA_ARGS__>, key_hint<__VA_ARGS__>, __VA_ARGS__ }
+
+static const struct {
+	tuple_hint_t tuple_hint;
+	key_hint_t key_hint;
+	enum field_type type;
+	bool is_nullable;
+	bool has_collation;
+} hint_arr[] = {
+	HINT(FIELD_TYPE_UNSIGNED, false, false),
+	HINT(FIELD_TYPE_STRING, false, false),
+	HINT(FIELD_TYPE_NUMBER, false, false),
+	HINT(FIELD_TYPE_INTEGER, false, false),
+	HINT(FIELD_TYPE_BOOLEAN, false, false),
+	HINT(FIELD_TYPE_SCALAR, false, false),
+	HINT(FIELD_TYPE_UNSIGNED, true, false),
+	HINT(FIELD_TYPE_STRING, true, false),
+	HINT(FIELD_TYPE_NUMBER, true, false),
+	HINT(FIELD_TYPE_INTEGER, true, false),
+	HINT(FIELD_TYPE_BOOLEAN, true, false),
+	HINT(FIELD_TYPE_SCALAR, true, false),
+	HINT(FIELD_TYPE_STRING, false, true),
+	HINT(FIELD_TYPE_SCALAR, false, true),
+	HINT(FIELD_TYPE_STRING, true, true),
+	HINT(FIELD_TYPE_SCALAR, true, true)
+};
+
+#undef HINT
+
 void
 key_def_set_compare_func(struct key_def *def)
 {
 	def->tuple_compare = NULL;
 	def->tuple_compare_with_key = NULL;
+	def->tuple_compare_hinted = NULL;
+	def->tuple_compare_with_key_hinted = NULL;
+	def->key_hint = NULL;
+	def->tuple_hint = NULL;
 	if (!def->is_nullable && !key_def_has_collation(def) &&
 	    !def->has_json_paths) {
 		/* Precalculated comparators don't use collation */
@@ -1242,6 +1587,8 @@ key_def_set_compare_func(struct key_def *def)
 			if (i == def->part_count &&
 			    precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) {
 				def->tuple_compare = precompiled_cmp_arr[k].f;
+				def->tuple_compare_hinted =
+					precompiled_cmp_arr[k].f_hinted;
 				break;
 			}
 		}
@@ -1258,6 +1605,8 @@ key_def_set_compare_func(struct key_def *def)
 			if (i == def->part_count) {
 				def->tuple_compare_with_key =
 					precompiled_cmp_wk_arr[k].f;
+				def->tuple_compare_with_key_hinted =
+					precompiled_cmp_wk_arr[k].f_hinted;
 				break;
 			}
 		}
@@ -1274,11 +1623,16 @@ key_def_set_compare_func(struct key_def *def)
 				if (def->tuple_compare == NULL) {
 					def->tuple_compare =
 						cmp_arr[k].tuple_compare;
+					def->tuple_compare_hinted =
+						cmp_arr[k].tuple_compare_hinted;
 				}
 				if (def->tuple_compare_with_key == NULL) {
 					def->tuple_compare_with_key =
 						cmp_arr[k].
 						tuple_compare_with_key;
+					def->tuple_compare_with_key_hinted =
+						cmp_arr[k].
+						tuple_compare_with_key_hinted;
 				}
 				break;
 			}
@@ -1286,4 +1640,14 @@ key_def_set_compare_func(struct key_def *def)
 	}
 	assert(def->tuple_compare != NULL &&
 	       def->tuple_compare_with_key != NULL);
+	for (uint32_t k = 0; k < sizeof(hint_arr) / sizeof(hint_arr[0]); k++) {
+		if (def->parts->type == hint_arr[k].type &&
+		    key_part_is_nullable(def->parts) ==
+		    hint_arr[k].is_nullable &&
+		    (def->parts->coll != NULL) == hint_arr[k].has_collation) {
+			def->key_hint = hint_arr[k].key_hint;
+			def->tuple_hint = hint_arr[k].tuple_hint;
+			break;
+		}
+	}
 }
diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
index afc15e809..d110b156b 100644
--- a/src/lib/coll/coll.c
+++ b/src/lib/coll/coll.c
@@ -133,6 +133,33 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
 	return s_len;
 }
 
+int
+coll_icu_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len,
+		      struct coll *coll)
+{
+	assert(coll->type == COLL_TYPE_ICU);
+	UCharIterator itr;
+	uiter_setUTF8(&itr, s, s_len);
+	uint32_t state[2] = {0, 0};
+	UErrorCode status = U_ZERO_ERROR;
+	int32_t got = ucol_nextSortKeyPart(coll->collator, &itr, state,
+					   (uint8_t *)buff, buff_len, &status);
+	assert(got >= 0 && (size_t)got <= buff_len);
+	assert(U_SUCCESS(status));
+	return got;
+}
+
+int
+coll_bin_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len,
+		      struct coll *coll)
+{
+	(void)coll;
+	assert(coll->type == COLL_TYPE_BINARY);
+	int to_copy = (int)MIN(s_len, buff_len);
+	memcpy(buff, s, to_copy);
+	return to_copy;
+}
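[Editor's note: the get_sort_key hook exists so that collated strings can be hinted at all — raw bytes compare in the wrong order under a collation, but a sort-key prefix compares correctly with memcmp(). A toy stand-in for coll_icu_get_sort_key() using case folding as the "collation" (ICU produces real sort keys via ucol_nextSortKeyPart()):]

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy collation: a lowercased sort key gives case-insensitive order. */
static int
toy_get_sort_key(const char *s, size_t s_len, char *buff, size_t buff_len)
{
	size_t n = s_len < buff_len ? s_len : buff_len;
	for (size_t i = 0; i < n; i++)
		buff[i] = (char)tolower((unsigned char)s[i]);
	return (int)n;
}

/* Compare two strings through their 7-byte sort-key prefixes. */
static int
toy_coll_cmp(const char *a, const char *b)
{
	char ka[7] = {0}, kb[7] = {0};
	toy_get_sort_key(a, strlen(a), ka, sizeof(ka));
	toy_get_sort_key(b, strlen(b), kb, sizeof(kb));
	return memcmp(ka, kb, sizeof(ka));
}
```

The binary collation variant below it needs no transformation, so it just copies the prefix, exactly as coll_bin_get_sort_key() does.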
+
 /**
  * Set up ICU collator and init cmp and hash members of collation.
  * @param coll Collation to set up.
@@ -242,6 +269,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
 	}
 	coll->cmp = coll_icu_cmp;
 	coll->hash = coll_icu_hash;
+	coll->get_sort_key = coll_icu_get_sort_key;
 	return 0;
 }
 
@@ -356,6 +384,7 @@ coll_new(const struct coll_def *def)
 		coll->collator = NULL;
 		coll->cmp = coll_bin_cmp;
 		coll->hash = coll_bin_hash;
+		coll->get_sort_key = coll_bin_get_sort_key;
 		break;
 	default:
 		unreachable();
diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h
index 8c9f94293..a68668ec5 100644
--- a/src/lib/coll/coll.h
+++ b/src/lib/coll/coll.h
@@ -48,6 +48,9 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
 typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
 				uint32_t *pcarry, struct coll *coll);
 
+typedef int (*coll_get_sort_key_f)(const char *s, size_t s_len, char *buff,
+				   size_t buff_len, struct coll *coll);
+
 struct UCollator;
 
 /**
@@ -62,6 +65,8 @@ struct coll {
 	/** String comparator. */
 	coll_cmp_f cmp;
 	coll_hash_f hash;
+	/** Fill the buffer with a binary-comparable sort key. */
+	coll_get_sort_key_f get_sort_key;
 	/** Reference counter. */
 	int refs;
 	/**
diff --git a/test/engine/ddl.result b/test/engine/ddl.result
index 8d34d5ef4..1791138c6 100644
--- a/test/engine/ddl.result
+++ b/test/engine/ddl.result
@@ -1935,6 +1935,186 @@ s:drop()
 ---
 ...
 --
+-- gh-3961: Introduce tuple comparison hints
+--
+s = box.schema.space.create('test', {engine = engine})
+---
+...
+pk = s:create_index('pk')
+---
+...
+i1 = s:create_index('i1', {unique = true,  parts = {2, 'scalar'}})
+---
+...
+bdbl = (2.1)^62;
+---
+...
+nbdbl = -bdbl
+---
+...
+_ = s:insert{0, bdbl}
+---
+...
+_ = s:insert{1, nbdbl}
+---
+...
+_ = s:insert{2, -2.3}
+---
+...
+_ = s:insert{3, -2}
+---
+...
+_ = s:insert{4, -1.3}
+---
+...
+_ = s:insert{5, -1}
+---
+...
+_ = s:insert{6, 0}
+---
+...
+_ = s:insert{7, 1}
+---
+...
+_ = s:insert{8, 1.3}
+---
+...
+_ = s:insert{9, 2}
+---
+...
+_ = s:insert{10, 2.3}
+---
+...
+i1:select()
+---
+- - [1, -9.4972150816945e+19]
+  - [2, -2.3]
+  - [3, -2]
+  - [4, -1.3]
+  - [5, -1]
+  - [6, 0]
+  - [7, 1]
+  - [8, 1.3]
+  - [9, 2]
+  - [10, 2.3]
+  - [0, 9.4972150816945e+19]
+...
+i1:alter{parts = {2, 'number'}}
+---
+...
+_ = s:insert{11, -1.2}
+---
+...
+_ = s:insert{12, 1.2}
+---
+...
+_ = s:insert{13, -5}
+---
+...
+_ = s:insert{14, 5}
+---
+...
+i1:select()
+---
+- - [1, -9.4972150816945e+19]
+  - [13, -5]
+  - [2, -2.3]
+  - [3, -2]
+  - [4, -1.3]
+  - [11, -1.2]
+  - [5, -1]
+  - [6, 0]
+  - [7, 1]
+  - [12, 1.2]
+  - [8, 1.3]
+  - [9, 2]
+  - [10, 2.3]
+  - [14, 5]
+  - [0, 9.4972150816945e+19]
+...
+_ = i1:delete({-2.3})
+---
+...
+_ = i1:delete({-1.3})
+---
+...
+_ = i1:delete({1.3})
+---
+...
+_ = i1:delete({2.3})
+---
+...
+_ = i1:delete({-1.2})
+---
+...
+_ = i1:delete({1.2})
+---
+...
+_ = i1:delete({bdbl})
+---
+...
+_ = i1:delete({nbdbl})
+---
+...
+i1:alter{parts = {2, 'integer'}}
+---
+...
+_ = s:insert{15, -4}
+---
+...
+_ = s:insert{16, 4}
+---
+...
+i1:select()
+---
+- - [13, -5]
+  - [15, -4]
+  - [3, -2]
+  - [5, -1]
+  - [6, 0]
+  - [7, 1]
+  - [9, 2]
+  - [16, 4]
+  - [14, 5]
+...
+_ = i1:delete({-5})
+---
+...
+_ = i1:delete({-4})
+---
+...
+_ = i1:delete({-2})
+---
+...
+_ = i1:delete({-1})
+---
+...
+i1:alter{parts = {2, 'unsigned'}}
+---
+...
+_ = s:insert({17, 10})
+---
+...
+i1:alter{parts = {2, 'scalar'}}
+---
+...
+_ = s:insert({18, 1.1})
+---
+...
+i1:select()
+---
+- - [6, 0]
+  - [7, 1]
+  - [18, 1.1]
+  - [9, 2]
+  - [16, 4]
+  - [14, 5]
+  - [17, 10]
+...
+s:drop()
+---
+...
+--
 -- Creating/altering a secondary index of a non-empty space.
 --
 s = box.schema.space.create('test', {engine = engine})
diff --git a/test/engine/ddl.test.lua b/test/engine/ddl.test.lua
index cdaf7a5bf..a73075498 100644
--- a/test/engine/ddl.test.lua
+++ b/test/engine/ddl.test.lua
@@ -710,6 +710,55 @@ _ = s:create_index('sk', {parts = {2, 'unsigned'}})
 c:get()
 s:drop()
 
+--
+-- gh-3961: Introduce tuple comparison hints
+--
+s = box.schema.space.create('test', {engine = engine})
+pk = s:create_index('pk')
+i1 = s:create_index('i1', {unique = true,  parts = {2, 'scalar'}})
+bdbl = (2.1)^62;
+nbdbl = -bdbl
+_ = s:insert{0, bdbl}
+_ = s:insert{1, nbdbl}
+_ = s:insert{2, -2.3}
+_ = s:insert{3, -2}
+_ = s:insert{4, -1.3}
+_ = s:insert{5, -1}
+_ = s:insert{6, 0}
+_ = s:insert{7, 1}
+_ = s:insert{8, 1.3}
+_ = s:insert{9, 2}
+_ = s:insert{10, 2.3}
+i1:select()
+i1:alter{parts = {2, 'number'}}
+_ = s:insert{11, -1.2}
+_ = s:insert{12, 1.2}
+_ = s:insert{13, -5}
+_ = s:insert{14, 5}
+i1:select()
+_ = i1:delete({-2.3})
+_ = i1:delete({-1.3})
+_ = i1:delete({1.3})
+_ = i1:delete({2.3})
+_ = i1:delete({-1.2})
+_ = i1:delete({1.2})
+_ = i1:delete({bdbl})
+_ = i1:delete({nbdbl})
+i1:alter{parts = {2, 'integer'}}
+_ = s:insert{15, -4}
+_ = s:insert{16, 4}
+i1:select()
+_ = i1:delete({-5})
+_ = i1:delete({-4})
+_ = i1:delete({-2})
+_ = i1:delete({-1})
+i1:alter{parts = {2, 'unsigned'}}
+_ = s:insert({17, 10})
+i1:alter{parts = {2, 'scalar'}}
+_ = s:insert({18, 1.1})
+i1:select()
+s:drop()
+
 --
 -- Creating/altering a secondary index of a non-empty space.
 --
-- 
2.21.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-14  7:04   ` Vladimir Davydov
@ 2019-03-15 10:20     ` Kirill Shcherbatov
  2019-03-15 10:55       ` Kirill Shcherbatov
  2019-03-19 19:38       ` Vladimir Davydov
  0 siblings, 2 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:20 UTC (permalink / raw)
  To: tarantool-patches, Vladimir Davydov

Refactored the key_def_set_compare_func routine to make it easier to
read, maintain, and extend. This is necessary because it will be
significantly expanded by a series of subsequent patches.

Needed for #3961
---
 src/box/tuple_compare.cc | 195 +++++++++++++++++++--------------------
 1 file changed, 93 insertions(+), 102 deletions(-)

diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index dfe30b190..7fe1766a8 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -986,7 +986,7 @@ struct comparator_signature {
 /**
  * field1 no, field1 type, field2 no, field2 type, ...
  */
-static const comparator_signature cmp_arr[] = {
+static const comparator_signature precompiled_cmp_arr[] = {
 	COMPARATOR(0, FIELD_TYPE_UNSIGNED)
 	COMPARATOR(0, FIELD_TYPE_STRING)
 	COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED)
@@ -1005,55 +1005,6 @@ static const comparator_signature cmp_arr[] = {
 
 #undef COMPARATOR
 
-static const tuple_compare_t compare_slowpath_funcs[] = {
-	tuple_compare_slowpath<false, false, false>,
-	tuple_compare_slowpath<true, false, false>,
-	tuple_compare_slowpath<false, true, false>,
-	tuple_compare_slowpath<true, true, false>,
-	tuple_compare_slowpath<false, false, true>,
-	tuple_compare_slowpath<true, false, true>,
-	tuple_compare_slowpath<false, true, true>,
-	tuple_compare_slowpath<true, true, true>
-};
-
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
-{
-	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
-			   2 * (def->has_optional_parts ? 1 : 0) +
-			   4 * (def->has_json_paths ? 1 : 0);
-	if (def->is_nullable) {
-		if (key_def_is_sequential(def)) {
-			if (def->has_optional_parts)
-				return tuple_compare_sequential<true, true>;
-			else
-				return tuple_compare_sequential<true, false>;
-		} else {
-			return compare_slowpath_funcs[cmp_func_idx];
-		}
-	}
-	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def) && !def->has_json_paths) {
-		/* Precalculated comparators don't use collation */
-		for (uint32_t k = 0;
-		     k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
-			uint32_t i = 0;
-			for (; i < def->part_count; i++)
-				if (def->parts[i].fieldno !=
-				    cmp_arr[k].p[i * 2] ||
-				    def->parts[i].type !=
-				    cmp_arr[k].p[i * 2 + 1])
-					break;
-			if (i == def->part_count &&
-			    cmp_arr[k].p[i * 2] == UINT32_MAX)
-				return cmp_arr[k].f;
-		}
-	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_sequential<false, false> :
-	       compare_slowpath_funcs[cmp_func_idx];
-}
-
 /* }}} tuple_compare */
 
 /* {{{ tuple_compare_with_key */
@@ -1215,7 +1166,7 @@ struct comparator_with_key_signature
 #define KEY_COMPARATOR(...) \
 	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
 
-static const comparator_with_key_signature cmp_wk_arr[] = {
+static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
 	KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
 	KEY_COMPARATOR(0, FIELD_TYPE_STRING  , 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
 	KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_STRING  , 2, FIELD_TYPE_UNSIGNED)
@@ -1231,68 +1182,108 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
 	KEY_COMPARATOR(1, FIELD_TYPE_STRING  , 2, FIELD_TYPE_STRING)
 };
 
-#undef KEY_COMPARATOR
-
-static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
-	tuple_compare_with_key_slowpath<false, false, false>,
-	tuple_compare_with_key_slowpath<true, false, false>,
-	tuple_compare_with_key_slowpath<false, true, false>,
-	tuple_compare_with_key_slowpath<true, true, false>,
-	tuple_compare_with_key_slowpath<false, false, true>,
-	tuple_compare_with_key_slowpath<true, false, true>,
-	tuple_compare_with_key_slowpath<false, true, true>,
-	tuple_compare_with_key_slowpath<true, true, true>
+#undef KEY_COMPARATORq
+
+#define COMPARE_SLOWPATH(...) \
+	{tuple_compare_slowpath<__VA_ARGS__>, \
+	 tuple_compare_with_key_slowpath<__VA_ARGS__>, \
+	 __VA_ARGS__, false}
+
+#define COMPARE_SEQUENTIAL(...) \
+	{tuple_compare_sequential<__VA_ARGS__>, \
+	 tuple_compare_with_key_sequential<__VA_ARGS__>,  \
+	 __VA_ARGS__, false, true}
+
+static struct  {
+	tuple_compare_t tuple_compare;
+	tuple_compare_with_key_t tuple_compare_with_key;
+	bool is_nullable;
+	bool has_optional_parts;
+	bool has_json_paths;
+	bool is_sequential;
+} cmp_arr[] = {
+	COMPARE_SLOWPATH(false, false, false),
+	COMPARE_SLOWPATH(true, false, false),
+	COMPARE_SLOWPATH(false, true, false),
+	COMPARE_SLOWPATH(true, true, false),
+	COMPARE_SLOWPATH(false, false, true),
+	COMPARE_SLOWPATH(true, false, true),
+	COMPARE_SLOWPATH(false, true, true),
+	COMPARE_SLOWPATH(true, true, true),
+	COMPARE_SEQUENTIAL(false, false),
+	COMPARE_SEQUENTIAL(true, false),
+	COMPARE_SEQUENTIAL(false, true),
+	COMPARE_SEQUENTIAL(true, true),
 };
 
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
+#undef COMPARE_SLOWPATH
+#undef COMPARE_SEQUENTIAL
+
+/* }}} tuple_compare_with_key */
+
+void
+key_def_set_compare_func(struct key_def *def)
 {
-	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
-			   2 * (def->has_optional_parts ? 1 : 0) +
-			   4 * (def->has_json_paths ? 1 : 0);
-	if (def->is_nullable) {
-		if (key_def_is_sequential(def)) {
-			if (def->has_optional_parts) {
-				return tuple_compare_with_key_sequential<true,
-									 true>;
-			} else {
-				return tuple_compare_with_key_sequential<true,
-									 false>;
+	def->tuple_compare = NULL;
+	def->tuple_compare_with_key = NULL;
+	if (!def->is_nullable && !key_def_has_collation(def) &&
+	    !def->has_json_paths) {
+		/* Precalculated comparators don't use collation */
+		for (uint32_t k = 0; k < sizeof(precompiled_cmp_arr) /
+				 sizeof(precompiled_cmp_arr[0]); k++) {
+			uint32_t i = 0;
+			for (; i < def->part_count; i++) {
+				if (def->parts[i].fieldno !=
+				precompiled_cmp_arr[k].p[i * 2] ||
+				def->parts[i].type !=
+				precompiled_cmp_arr[k].p[i * 2 + 1])
+					break;
+			}
+			if (i == def->part_count &&
+			    precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) {
+				def->tuple_compare = precompiled_cmp_arr[k].f;
+				break;
 			}
-		} else {
-			return compare_with_key_slowpath_funcs[cmp_func_idx];
 		}
-	}
-	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def) && !def->has_json_paths) {
-		/* Precalculated comparators don't use collation */
-		for (uint32_t k = 0;
-		     k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
-		     k++) {
-
+		for (uint32_t k = 0; k < sizeof(precompiled_cmp_wk_arr) /
+				 sizeof(precompiled_cmp_wk_arr[0]); k++) {
 			uint32_t i = 0;
 			for (; i < def->part_count; i++) {
 				if (def->parts[i].fieldno !=
-				    cmp_wk_arr[k].p[i * 2] ||
-				    def->parts[i].type !=
-				    cmp_wk_arr[k].p[i * 2 + 1]) {
+				precompiled_cmp_wk_arr[k].p[i * 2] ||
+				def->parts[i].type !=
+				precompiled_cmp_wk_arr[k].p[i * 2 + 1])
 					break;
+			}
+			if (i == def->part_count) {
+				def->tuple_compare_with_key =
+					precompiled_cmp_wk_arr[k].f;
+				break;
+			}
+		}
+	}
+	if (def->tuple_compare == NULL || def->tuple_compare_with_key == NULL) {
+		for (uint32_t k = 0; k < sizeof(cmp_arr) /
+					 sizeof(cmp_arr[0]); k++) {
+			if (def->is_nullable == cmp_arr[k].is_nullable &&
+			    def->has_optional_parts ==
+			    cmp_arr[k].has_optional_parts &&
+			    def->has_json_paths == cmp_arr[k].has_json_paths &&
+			    key_def_is_sequential(def) ==
+			    cmp_arr[k].is_sequential) {
+				if (def->tuple_compare == NULL) {
+					def->tuple_compare =
+						cmp_arr[k].tuple_compare;
 				}
+				if (def->tuple_compare_with_key == NULL) {
+					def->tuple_compare_with_key =
+						cmp_arr[k].
+						tuple_compare_with_key;
+				}
+				break;
 			}
-			if (i == def->part_count)
-				return cmp_wk_arr[k].f;
 		}
 	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_with_key_sequential<false, false> :
-	       compare_with_key_slowpath_funcs[cmp_func_idx];
-}
-
-/* }}} tuple_compare_with_key */
-
-void
-key_def_set_compare_func(struct key_def *def)
-{
-	def->tuple_compare = tuple_compare_create(def);
-	def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+	assert(def->tuple_compare != NULL &&
+	       def->tuple_compare_with_key != NULL);
 }
-- 
2.21.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
@ 2019-03-15 10:55       ` Kirill Shcherbatov
  2019-03-19 19:38       ` Vladimir Davydov
  1 sibling, 0 replies; 11+ messages in thread
From: Kirill Shcherbatov @ 2019-03-15 10:55 UTC (permalink / raw)
  To: tarantool-patches, Vladimir Davydov

@@ -1182,7 +1182,7 @@ static const comparator_with_key_signature precompiled_cmp_wk_arr[] = {
        KEY_COMPARATOR(1, FIELD_TYPE_STRING  , 2, FIELD_TYPE_STRING)
 };
 
-#undef KEY_COMPARATORq
+#undef KEY_COMPARATOR

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [tarantool-patches] Re: [PATCH v6 1/3] box: refactor key_def_set_compare_func routine
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
  2019-03-15 10:55       ` Kirill Shcherbatov
@ 2019-03-19 19:38       ` Vladimir Davydov
  1 sibling, 0 replies; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-19 19:38 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

On Fri, Mar 15, 2019 at 01:20:11PM +0300, Kirill Shcherbatov wrote:
> +#define COMPARE_SLOWPATH(...) \
> +	{tuple_compare_slowpath<__VA_ARGS__>, \
> +	 tuple_compare_with_key_slowpath<__VA_ARGS__>, \
> +	 __VA_ARGS__, false}
> +
> +#define COMPARE_SEQUENTIAL(...) \
> +	{tuple_compare_sequential<__VA_ARGS__>, \
> +	 tuple_compare_with_key_sequential<__VA_ARGS__>,  \
> +	 __VA_ARGS__, false, true}
> +
> +static struct  {
> +	tuple_compare_t tuple_compare;
> +	tuple_compare_with_key_t tuple_compare_with_key;
> +	bool is_nullable;
> +	bool has_optional_parts;
> +	bool has_json_paths;
> +	bool is_sequential;
> +} cmp_arr[] = {
> +	COMPARE_SLOWPATH(false, false, false),
> +	COMPARE_SLOWPATH(true, false, false),
> +	COMPARE_SLOWPATH(false, true, false),
> +	COMPARE_SLOWPATH(true, true, false),
> +	COMPARE_SLOWPATH(false, false, true),
> +	COMPARE_SLOWPATH(true, false, true),
> +	COMPARE_SLOWPATH(false, true, true),
> +	COMPARE_SLOWPATH(true, true, true),
> +	COMPARE_SEQUENTIAL(false, false),
> +	COMPARE_SEQUENTIAL(true, false),
> +	COMPARE_SEQUENTIAL(false, true),
> +	COMPARE_SEQUENTIAL(true, true),
>  };
>  
> -static tuple_compare_with_key_t
> -tuple_compare_with_key_create(const struct key_def *def)
> +#undef COMPARE_SLOWPATH
> +#undef COMPARE_SEQUENTIAL
> +
> +/* }}} tuple_compare_with_key */
> +
> +void
> +key_def_set_compare_func(struct key_def *def)
>  {
> -	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
> -			   2 * (def->has_optional_parts ? 1 : 0) +
> -			   4 * (def->has_json_paths ? 1 : 0);
> -	if (def->is_nullable) {
> -		if (key_def_is_sequential(def)) {
> -			if (def->has_optional_parts) {
> -				return tuple_compare_with_key_sequential<true,
> -									 true>;
> -			} else {
> -				return tuple_compare_with_key_sequential<true,
> -									 false>;
> +	def->tuple_compare = NULL;
> +	def->tuple_compare_with_key = NULL;
> +	if (!def->is_nullable && !key_def_has_collation(def) &&
> +	    !def->has_json_paths) {
> +		/* Precalculated comparators don't use collation */
> +		for (uint32_t k = 0; k < sizeof(precompiled_cmp_arr) /
> +				 sizeof(precompiled_cmp_arr[0]); k++) {
> +			uint32_t i = 0;
> +			for (; i < def->part_count; i++) {
> +				if (def->parts[i].fieldno !=
> +				precompiled_cmp_arr[k].p[i * 2] ||
> +				def->parts[i].type !=
> +				precompiled_cmp_arr[k].p[i * 2 + 1])
> +					break;
> +			}
> +			if (i == def->part_count &&
> +			    precompiled_cmp_arr[k].p[i * 2] == UINT32_MAX) {
> +				def->tuple_compare = precompiled_cmp_arr[k].f;
> +				break;
>  			}
> -		} else {
> -			return compare_with_key_slowpath_funcs[cmp_func_idx];
>  		}
> -	}
> -	assert(! def->has_optional_parts);
> -	if (!key_def_has_collation(def) && !def->has_json_paths) {
> -		/* Precalculated comparators don't use collation */
> -		for (uint32_t k = 0;
> -		     k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
> -		     k++) {
> -
> +		for (uint32_t k = 0; k < sizeof(precompiled_cmp_wk_arr) /
> +				 sizeof(precompiled_cmp_wk_arr[0]); k++) {
>  			uint32_t i = 0;
>  			for (; i < def->part_count; i++) {
>  				if (def->parts[i].fieldno !=
> -				    cmp_wk_arr[k].p[i * 2] ||
> -				    def->parts[i].type !=
> -				    cmp_wk_arr[k].p[i * 2 + 1]) {
> +				precompiled_cmp_wk_arr[k].p[i * 2] ||
> +				def->parts[i].type !=
> +				precompiled_cmp_wk_arr[k].p[i * 2 + 1])
>  					break;
> +			}
> +			if (i == def->part_count) {
> +				def->tuple_compare_with_key =
> +					precompiled_cmp_wk_arr[k].f;
> +				break;
> +			}
> +		}
> +	}
> +	if (def->tuple_compare == NULL || def->tuple_compare_with_key == NULL) {
> +		for (uint32_t k = 0; k < sizeof(cmp_arr) /
> +					 sizeof(cmp_arr[0]); k++) {
> +			if (def->is_nullable == cmp_arr[k].is_nullable &&
> +			    def->has_optional_parts ==
> +			    cmp_arr[k].has_optional_parts &&
> +			    def->has_json_paths == cmp_arr[k].has_json_paths &&
> +			    key_def_is_sequential(def) ==
> +			    cmp_arr[k].is_sequential) {
> +				if (def->tuple_compare == NULL) {
> +					def->tuple_compare =
> +						cmp_arr[k].tuple_compare;
>  				}
> +				if (def->tuple_compare_with_key == NULL) {
> +					def->tuple_compare_with_key =
> +						cmp_arr[k].
> +						tuple_compare_with_key;
> +				}
> +				break;
>  			}
> -			if (i == def->part_count)
> -				return cmp_wk_arr[k].f;
>  		}
>  	}

All these unnamed booleans, variadic macros, and the linear search over
them look mind-boggling IMO. I understand that this approach is kinda
generic, but it just looks ugly. In fact, I don't see any need for such
a generalization at this point. See, although we'll have four
parameters - is_nullable, has_optional_parts, has_json_paths, and
is_multikey - they are not independent:

 - has_optional_parts depends on is_nullable
 - is_multikey depends on has_json_paths

So the number of combinations is far from staggering, and I don't see any
new parameters coming in the near future.

So I dropped this patch and instead split key_def_set_compare_func into
sub-functions by key_def types. You will find the patch below. It should
be trivial to add hinted comparators and multikey support on top of it
without introducing deep if-else hierarchies or excessive code repetition.

---

From 829c811cdeb98bd8e5cb91edf2197f0c340d55ed Mon Sep 17 00:00:00 2001
Message-Id: <829c811cdeb98bd8e5cb91edf2197f0c340d55ed.1553024132.git.vdavydov.dev@gmail.com>
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Date: Tue, 19 Mar 2019 20:27:42 +0300
Subject: [PATCH] tuple_compare: cleanup key_def virtual func setter

Over time key_def virtual func setter code has turned into a complete
mess: now it's an incomprehensible mix of if-else blocks, linear array
search, and devious array indexing. What is worse, it's duplicated by
tuple_compare_create and tuple_compare_with_key_create. Adding yet
another parameter to key_def templates, which is needed to implement
multi-key indexes, is literally impossible.

This patch attempts to alleviate the situation by splitting code paths
for pre-compiled, plain, and json indexes, and by merging the
tuple_compare and tuple_compare_with_key setters.
---
 src/box/tuple_compare.cc | 211 +++++++++++++++++++++++------------------------
 1 file changed, 105 insertions(+), 106 deletions(-)

diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index dfe30b19..19b396e2 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -1005,55 +1005,6 @@ static const comparator_signature cmp_arr[] = {
 
 #undef COMPARATOR
 
-static const tuple_compare_t compare_slowpath_funcs[] = {
-	tuple_compare_slowpath<false, false, false>,
-	tuple_compare_slowpath<true, false, false>,
-	tuple_compare_slowpath<false, true, false>,
-	tuple_compare_slowpath<true, true, false>,
-	tuple_compare_slowpath<false, false, true>,
-	tuple_compare_slowpath<true, false, true>,
-	tuple_compare_slowpath<false, true, true>,
-	tuple_compare_slowpath<true, true, true>
-};
-
-static tuple_compare_t
-tuple_compare_create(const struct key_def *def)
-{
-	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
-			   2 * (def->has_optional_parts ? 1 : 0) +
-			   4 * (def->has_json_paths ? 1 : 0);
-	if (def->is_nullable) {
-		if (key_def_is_sequential(def)) {
-			if (def->has_optional_parts)
-				return tuple_compare_sequential<true, true>;
-			else
-				return tuple_compare_sequential<true, false>;
-		} else {
-			return compare_slowpath_funcs[cmp_func_idx];
-		}
-	}
-	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def) && !def->has_json_paths) {
-		/* Precalculated comparators don't use collation */
-		for (uint32_t k = 0;
-		     k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
-			uint32_t i = 0;
-			for (; i < def->part_count; i++)
-				if (def->parts[i].fieldno !=
-				    cmp_arr[k].p[i * 2] ||
-				    def->parts[i].type !=
-				    cmp_arr[k].p[i * 2 + 1])
-					break;
-			if (i == def->part_count &&
-			    cmp_arr[k].p[i * 2] == UINT32_MAX)
-				return cmp_arr[k].f;
-		}
-	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_sequential<false, false> :
-	       compare_slowpath_funcs[cmp_func_idx];
-}
-
 /* }}} tuple_compare */
 
 /* {{{ tuple_compare_with_key */
@@ -1233,66 +1184,114 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
 
 #undef KEY_COMPARATOR
 
-static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
-	tuple_compare_with_key_slowpath<false, false, false>,
-	tuple_compare_with_key_slowpath<true, false, false>,
-	tuple_compare_with_key_slowpath<false, true, false>,
-	tuple_compare_with_key_slowpath<true, true, false>,
-	tuple_compare_with_key_slowpath<false, false, true>,
-	tuple_compare_with_key_slowpath<true, false, true>,
-	tuple_compare_with_key_slowpath<false, true, true>,
-	tuple_compare_with_key_slowpath<true, true, true>
-};
-
-static tuple_compare_with_key_t
-tuple_compare_with_key_create(const struct key_def *def)
-{
-	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
-			   2 * (def->has_optional_parts ? 1 : 0) +
-			   4 * (def->has_json_paths ? 1 : 0);
-	if (def->is_nullable) {
-		if (key_def_is_sequential(def)) {
-			if (def->has_optional_parts) {
-				return tuple_compare_with_key_sequential<true,
-									 true>;
-			} else {
-				return tuple_compare_with_key_sequential<true,
-									 false>;
-			}
-		} else {
-			return compare_with_key_slowpath_funcs[cmp_func_idx];
-		}
-	}
-	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def) && !def->has_json_paths) {
-		/* Precalculated comparators don't use collation */
-		for (uint32_t k = 0;
-		     k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
-		     k++) {
-
-			uint32_t i = 0;
-			for (; i < def->part_count; i++) {
-				if (def->parts[i].fieldno !=
-				    cmp_wk_arr[k].p[i * 2] ||
-				    def->parts[i].type !=
-				    cmp_wk_arr[k].p[i * 2 + 1]) {
-					break;
-				}
-			}
-			if (i == def->part_count)
-				return cmp_wk_arr[k].f;
-		}
-	}
-	return key_def_is_sequential(def) ?
-	       tuple_compare_with_key_sequential<false, false> :
-	       compare_with_key_slowpath_funcs[cmp_func_idx];
-}
-
 /* }}} tuple_compare_with_key */
 
+static void
+key_def_set_compare_func_fast(struct key_def *def)
+{
+	assert(!def->is_nullable);
+	assert(!def->has_optional_parts);
+	assert(!def->has_json_paths);
+	assert(!key_def_has_collation(def));
+
+	tuple_compare_t cmp = NULL;
+	tuple_compare_with_key_t cmp_wk = NULL;
+	bool is_sequential = key_def_is_sequential(def);
+
+	/*
+	 * Use pre-compiled comparators if available, otherwise
+	 * fall back on generic comparators.
+	 */
+	for (uint32_t k = 0; k < lengthof(cmp_arr); k++) {
+		uint32_t i = 0;
+		for (; i < def->part_count; i++)
+			if (def->parts[i].fieldno != cmp_arr[k].p[i * 2] ||
+			    def->parts[i].type != cmp_arr[k].p[i * 2 + 1])
+				break;
+		if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) {
+			cmp = cmp_arr[k].f;
+			break;
+		}
+	}
+	for (uint32_t k = 0; k < lengthof(cmp_wk_arr); k++) {
+		uint32_t i = 0;
+		for (; i < def->part_count; i++) {
+			if (def->parts[i].fieldno != cmp_wk_arr[k].p[i * 2] ||
+			    def->parts[i].type != cmp_wk_arr[k].p[i * 2 + 1])
+				break;
+		}
+		if (i == def->part_count) {
+			cmp_wk = cmp_wk_arr[k].f;
+			break;
+		}
+	}
+	if (cmp == NULL) {
+		cmp = is_sequential ?
+			tuple_compare_sequential<false, false> :
+			tuple_compare_slowpath<false, false, false>;
+	}
+	if (cmp_wk == NULL) {
+		cmp_wk = is_sequential ?
+			tuple_compare_with_key_sequential<false, false> :
+			tuple_compare_with_key_slowpath<false, false, false>;
+	}
+
+	def->tuple_compare = cmp;
+	def->tuple_compare_with_key = cmp_wk;
+}
+
+template<bool is_nullable, bool has_optional_parts>
+static void
+key_def_set_compare_func_plain(struct key_def *def)
+{
+	assert(!def->has_json_paths);
+	if (key_def_is_sequential(def)) {
+		def->tuple_compare = tuple_compare_sequential
+					<is_nullable, has_optional_parts>;
+		def->tuple_compare_with_key = tuple_compare_with_key_sequential
+					<is_nullable, has_optional_parts>;
+	} else {
+		def->tuple_compare = tuple_compare_slowpath
+				<is_nullable, has_optional_parts, false>;
+		def->tuple_compare_with_key = tuple_compare_with_key_slowpath
+				<is_nullable, has_optional_parts, false>;
+	}
+}
+
+template<bool is_nullable, bool has_optional_parts>
+static void
+key_def_set_compare_func_json(struct key_def *def)
+{
+	assert(def->has_json_paths);
+	def->tuple_compare = tuple_compare_slowpath
+			<is_nullable, has_optional_parts, true>;
+	def->tuple_compare_with_key = tuple_compare_with_key_slowpath
+			<is_nullable, has_optional_parts, true>;
+}
+
 void
 key_def_set_compare_func(struct key_def *def)
 {
-	def->tuple_compare = tuple_compare_create(def);
-	def->tuple_compare_with_key = tuple_compare_with_key_create(def);
+	if (!key_def_has_collation(def) &&
+	    !def->is_nullable && !def->has_json_paths) {
+		key_def_set_compare_func_fast(def);
+	} else if (!def->has_json_paths) {
+		if (def->is_nullable && def->has_optional_parts) {
+			key_def_set_compare_func_plain<true, true>(def);
+		} else if (def->is_nullable && !def->has_optional_parts) {
+			key_def_set_compare_func_plain<true, false>(def);
+		} else {
+			assert(!def->is_nullable && !def->has_optional_parts);
+			key_def_set_compare_func_plain<false, false>(def);
+		}
+	} else {
+		if (def->is_nullable && def->has_optional_parts) {
+			key_def_set_compare_func_json<true, true>(def);
+		} else if (def->is_nullable && !def->has_optional_parts) {
+			key_def_set_compare_func_json<true, false>(def);
+		} else {
+			assert(!def->is_nullable && !def->has_optional_parts);
+			key_def_set_compare_func_json<false, false>(def);
+		}
+	}
 }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [tarantool-patches] Re: [PATCH v6 2/3] memtx: introduce tuple compare hint
  2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
@ 2019-03-20 18:08       ` Vladimir Davydov
  0 siblings, 0 replies; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-20 18:08 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

Pushed to master with some changes. You will find the new patch below.
I removed the tests because I find them insufficient. Please cover all
possible test cases and submit a separate patch.
---

From 9fba29abb4e05babb9b23b4413bd8083f0fba933 Mon Sep 17 00:00:00 2001
Message-Id: <9fba29abb4e05babb9b23b4413bd8083f0fba933.1553105230.git.vdavydov.dev@gmail.com>
From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Date: Wed, 13 Mar 2019 13:06:30 +0300
Subject: [PATCH] memtx: introduce tuple compare hint

Implement functions for retrieving tuple hints for a particular
key_def. A hint is an integer that can be used to optimize tuple
comparison: if the hint of one tuple is less than the hint of another,
then the first tuple is definitely less than the second; only if the
hints are equal must tuple_compare be called to get the comparison
result. Hints are calculated using only the first part of key_def.

@locker:
 - Rework key_def_set_hint_func.
 - Refactor functions calculating hints.
 - Drop has_collation template argument (we don't use templates
   for collations anywhere else).
 - Add part_count argument to key_hint (it's conventional to pass
   part_count along with decoded key).
 - Improve comments, rename a few functions, and cleanup code.

Close #3961
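
The hint contract described above boils down to a cheap first-pass
comparison with a fallback. A minimal illustrative sketch, not the
actual Tarantool code: here plain integers stand in for tuples, and a
hypothetical full_compare() stands in for tuple_compare().

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hint_t;

/* Stand-in for the full (expensive) tuple_compare(). */
static int
full_compare(uint64_t a, uint64_t b)
{
	return a < b ? -1 : a > b ? 1 : 0;
}

/*
 * If the hints differ, the order is decided without touching tuple
 * data at all; only on a hint tie does the full comparison run.
 */
static int
compare_hinted(uint64_t a, hint_t a_hint, uint64_t b, hint_t b_hint)
{
	if (a_hint != b_hint)
		return a_hint < b_hint ? -1 : 1;
	return full_compare(a, b);
}
```

In the patch below, memtx_tree stores such a hint alongside each tuple
pointer, so the fast path needs no tuple data fetch at all.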
---
 src/box/key_def.h        | 115 ++++++++++
 src/box/memtx_tree.c     |  40 +++-
 src/box/tuple_compare.cc | 535 +++++++++++++++++++++++++++++++++++++++++++++--
 src/lib/coll/coll.c      |  26 +++
 src/lib/coll/coll.h      |  12 ++
 5 files changed, 702 insertions(+), 26 deletions(-)

diff --git a/src/box/key_def.h b/src/box/key_def.h
index 7c9f8c6b..288cf727 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -117,6 +117,27 @@ struct key_def;
 struct tuple;
 
 /**
+ * Tuple comparison hint h(t) is such a function of tuple t that
+ * the following conditions always hold for any pair of tuples
+ * t1 and t2:
+ *
+ *   if h(t1) < h(t2) then t1 < t2;
+ *   if h(t1) > h(t2) then t1 > t2;
+ *   if h(t1) == h(t2) then t1 may or may not be equal to t2.
+ *
+ * These rules mean that instead of direct tuple vs tuple
+ * (or tuple vs key) comparison one may compare their hints
+ * first and only if theirs hints equal compare the tuples
+ * themselves.
+ */
+typedef uint64_t hint_t;
+
+/**
+ * Reserved value to use when comparison hint is undefined.
+ */
+#define HINT_NONE ((hint_t)UINT64_MAX)
+
+/**
  * Get is_nullable property of key_part.
  * @param key_part for which attribute is being fetched
  *
@@ -133,10 +154,23 @@ typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
 					const char *key,
 					uint32_t part_count,
 					struct key_def *key_def);
+/** @copydoc tuple_compare_with_key_hinted() */
+typedef int (*tuple_compare_with_key_hinted_t)(const struct tuple *tuple,
+					       hint_t tuple_hint,
+					       const char *key,
+					       uint32_t part_count,
+					       hint_t key_hint,
+					       struct key_def *key_def);
 /** @copydoc tuple_compare() */
 typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
 			       const struct tuple *tuple_b,
 			       struct key_def *key_def);
+/** @copydoc tuple_compare_hinted() */
+typedef int (*tuple_compare_hinted_t)(const struct tuple *tuple_a,
+				      hint_t tuple_a_hint,
+				      const struct tuple *tuple_b,
+				      hint_t tuple_b_hint,
+				      struct key_def *key_def);
 /** @copydoc tuple_extract_key() */
 typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
 				     struct key_def *key_def,
@@ -152,13 +186,23 @@ typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
 /** @copydoc key_hash() */
 typedef uint32_t (*key_hash_t)(const char *key,
 				struct key_def *key_def);
+/** @copydoc tuple_hint() */
+typedef hint_t (*tuple_hint_t)(const struct tuple *tuple,
+			       struct key_def *key_def);
+/** @copydoc key_hint() */
+typedef hint_t (*key_hint_t)(const char *key, uint32_t part_count,
+			     struct key_def *key_def);
 
 /* Definition of a multipart key. */
 struct key_def {
 	/** @see tuple_compare() */
 	tuple_compare_t tuple_compare;
+	/** @see tuple_compare_hinted() */
+	tuple_compare_hinted_t tuple_compare_hinted;
 	/** @see tuple_compare_with_key() */
 	tuple_compare_with_key_t tuple_compare_with_key;
+	/** @see tuple_compare_with_key_hinted() */
+	tuple_compare_with_key_hinted_t tuple_compare_with_key_hinted;
 	/** @see tuple_extract_key() */
 	tuple_extract_key_t tuple_extract_key;
 	/** @see tuple_extract_key_raw() */
@@ -167,6 +211,10 @@ struct key_def {
 	tuple_hash_t tuple_hash;
 	/** @see key_hash() */
 	key_hash_t key_hash;
+	/** @see tuple_hint() */
+	tuple_hint_t tuple_hint;
+	/** @see key_hint() */
+	key_hint_t key_hint;
 	/**
 	 * Minimal part count which always is unique. For example,
 	 * if a secondary index is unique, then
@@ -558,6 +606,26 @@ tuple_compare(const struct tuple *tuple_a, const struct tuple *tuple_b,
 }
 
 /**
+ * Compare tuples using the key definition and comparison hints.
+ * @param tuple_a first tuple
+ * @param tuple_a_hint comparison hint of @a tuple_a
+ * @param tuple_b second tuple
+ * @param tuple_b_hint comparison hint of @a tuple_b
+ * @param key_def key definition
+ * @retval 0  if key_fields(tuple_a) == key_fields(tuple_b)
+ * @retval <0 if key_fields(tuple_a) < key_fields(tuple_b)
+ * @retval >0 if key_fields(tuple_a) > key_fields(tuple_b)
+ */
+static inline int
+tuple_compare_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+		     const struct tuple *tuple_b, hint_t tuple_b_hint,
+		     struct key_def *key_def)
+{
+	return key_def->tuple_compare_hinted(tuple_a, tuple_a_hint, tuple_b,
+					     tuple_b_hint, key_def);
+}
+
+/**
  * @brief Compare tuple with key using the key definition.
  * @param tuple tuple
  * @param key key parts without MessagePack array header
@@ -576,6 +644,29 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key,
 }
 
 /**
+ * Compare tuple with key using the key definition and
+ * comparison hints
+ * @param tuple tuple
+ * @param tuple_hint comparison hint of @a tuple
+ * @param key key parts without MessagePack array header
+ * @param part_count the number of parts in @a key
+ * @param key_hint comparison hint of @a key
+ * @param key_def key definition
+ * @retval 0  if key_fields(tuple) == parts(key)
+ * @retval <0 if key_fields(tuple) < parts(key)
+ * @retval >0 if key_fields(tuple) > parts(key)
+ */
+static inline int
+tuple_compare_with_key_hinted(const struct tuple *tuple, hint_t tuple_hint,
+			      const char *key, uint32_t part_count,
+			      hint_t key_hint, struct key_def *key_def)
+{
+	return key_def->tuple_compare_with_key_hinted(tuple, tuple_hint, key,
+						      part_count, key_hint,
+						      key_def);
+}
+
+/**
  * Compute hash of a tuple field.
  * @param ph1 - pointer to running hash
  * @param pcarry - pointer to carry
@@ -628,6 +719,30 @@ key_hash(const char *key, struct key_def *key_def)
 	return key_def->key_hash(key, key_def);
 }
 
+/**
+ * Get a comparison hint for a tuple.
+ * @param tuple - tuple to compute the hint for
+ * @param key_def - key_def used for tuple comparison
+ * @return - hint value
+ */
+static inline hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+	return key_def->tuple_hint(tuple, key_def);
+}
+
+/**
+ * Get a comparison hint for a key.
+ * @param key - key to compute the hint for
+ * @param part_count - number of parts in @a key
+ * @param key_def - key_def used for tuple comparison
+ * @return - hint value
+ */
+static inline hint_t
+key_hint(const char *key, uint32_t part_count, struct key_def *key_def)
+{
+	return key_def->key_hint(key, part_count, key_def);
+}
+
 #if defined(__cplusplus)
 } /* extern "C" */
 #endif /* defined(__cplusplus) */
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index e24c4032..fe037c54 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -47,6 +47,8 @@ struct memtx_tree_key_data {
 	const char *key;
 	/** Number of msgpacked search fields. */
 	uint32_t part_count;
+	/** Comparison hint, see key_hint(). */
+	hint_t hint;
 };
 
 /**
@@ -55,6 +57,8 @@ struct memtx_tree_key_data {
 struct memtx_tree_data {
 	/* Tuple that this node represents. */
 	struct tuple *tuple;
+	/** Comparison hint, see tuple_hint(). */
+	hint_t hint;
 };
 
 /**
@@ -69,16 +73,18 @@ static bool
 memtx_tree_data_identical(const struct memtx_tree_data *a,
 			  const struct memtx_tree_data *b)
 {
-	return a->tuple == b->tuple;
+	return a->tuple == b->tuple && a->hint == b->hint;
 }
 
 #define BPS_TREE_NAME memtx_tree
 #define BPS_TREE_BLOCK_SIZE (512)
 #define BPS_TREE_EXTENT_SIZE MEMTX_EXTENT_SIZE
 #define BPS_TREE_COMPARE(a, b, arg)\
-	tuple_compare((&a)->tuple, (&b)->tuple, arg)
+	tuple_compare_hinted((&a)->tuple, (&a)->hint, (&b)->tuple,\
+			     (&b)->hint, arg)
 #define BPS_TREE_COMPARE_KEY(a, b, arg)\
-	tuple_compare_with_key((&a)->tuple, (b)->key, (b)->part_count, arg)
+	tuple_compare_with_key_hinted((&a)->tuple, (&a)->hint, (b)->key,\
+				      (b)->part_count, (b)->hint, arg)
 #define BPS_TREE_IDENTICAL(a, b) memtx_tree_data_identical(&a, &b)
 #define bps_tree_elem_t struct memtx_tree_data
 #define bps_tree_key_t struct memtx_tree_key_data *
@@ -119,7 +125,8 @@ memtx_tree_qcompare(const void* a, const void *b, void *c)
 	const struct memtx_tree_data *data_a = a;
 	const struct memtx_tree_data *data_b = b;
 	struct key_def *key_def = c;
-	return tuple_compare(data_a->tuple, data_b->tuple, key_def);
+	return tuple_compare_hinted(data_a->tuple, data_a->hint, data_b->tuple,
+				    data_b->hint, key_def);
 }
 
 /* {{{ MemtxTree Iterators ****************************************/
@@ -241,9 +248,11 @@ tree_iterator_next_equal(struct iterator *iterator, struct tuple **ret)
 		memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
 	/* Use user key def to save a few loops. */
 	if (res == NULL ||
-	    tuple_compare_with_key(res->tuple, it->key_data.key,
-				   it->key_data.part_count,
-				   it->index_def->key_def) != 0) {
+	    tuple_compare_with_key_hinted(res->tuple, res->hint,
+					  it->key_data.key,
+					  it->key_data.part_count,
+					  it->key_data.hint,
+					  it->index_def->key_def) != 0) {
 		iterator->next = tree_iterator_dummie;
 		it->current.tuple = NULL;
 		*ret = NULL;
@@ -272,9 +281,11 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
 		memtx_tree_iterator_get_elem(it->tree, &it->tree_iterator);
 	/* Use user key def to save a few loops. */
 	if (res == NULL ||
-	    tuple_compare_with_key(res->tuple, it->key_data.key,
-				   it->key_data.part_count,
-				   it->index_def->key_def) != 0) {
+	    tuple_compare_with_key_hinted(res->tuple, res->hint,
+					  it->key_data.key,
+					  it->key_data.part_count,
+					  it->key_data.hint,
+					  it->index_def->key_def) != 0) {
 		iterator->next = tree_iterator_dummie;
 		it->current.tuple = NULL;
 		*ret = NULL;
@@ -513,9 +524,11 @@ memtx_tree_index_get(struct index *base, const char *key,
 	assert(base->def->opts.is_unique &&
 	       part_count == base->def->key_def->part_count);
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
 	struct memtx_tree_key_data key_data;
 	key_data.key = key;
 	key_data.part_count = part_count;
+	key_data.hint = key_hint(key, part_count, cmp_def);
 	struct memtx_tree_data *res = memtx_tree_find(&index->tree, &key_data);
 	*result = res != NULL ? res->tuple : NULL;
 	return 0;
@@ -527,9 +540,11 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 			 struct tuple **result)
 {
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
 	if (new_tuple) {
 		struct memtx_tree_data new_data;
 		new_data.tuple = new_tuple;
+		new_data.hint = tuple_hint(new_tuple, cmp_def);
 		struct memtx_tree_data dup_data;
 		dup_data.tuple = NULL;
 
@@ -562,6 +577,7 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple,
 	if (old_tuple) {
 		struct memtx_tree_data old_data;
 		old_data.tuple = old_tuple;
+		old_data.hint = tuple_hint(old_tuple, cmp_def);
 		memtx_tree_delete(&index->tree, old_data);
 	}
 	*result = old_tuple;
@@ -574,6 +590,7 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
 {
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
 	struct memtx_engine *memtx = (struct memtx_engine *)base->engine;
+	struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
 
 	assert(part_count == 0 || key != NULL);
 	if (type > ITER_GT) {
@@ -604,6 +621,7 @@ memtx_tree_index_create_iterator(struct index *base, enum iterator_type type,
 	it->type = type;
 	it->key_data.key = key;
 	it->key_data.part_count = part_count;
+	it->key_data.hint = key_hint(key, part_count, cmp_def);
 	it->index_def = base->def;
 	it->tree = &index->tree;
 	it->tree_iterator = memtx_tree_invalid_iterator();
@@ -641,6 +659,7 @@ static int
 memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
 {
 	struct memtx_tree_index *index = (struct memtx_tree_index *)base;
+	struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree);
 	if (index->build_array == NULL) {
 		index->build_array = malloc(MEMTX_EXTENT_SIZE);
 		if (index->build_array == NULL) {
@@ -668,6 +687,7 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple)
 	struct memtx_tree_data *elem =
 		&index->build_array[index->build_array_size++];
 	elem->tuple = tuple;
+	elem->hint = tuple_hint(tuple, cmp_def);
 	return 0;
 }
 
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 595f6f25..93756365 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -36,6 +36,25 @@
 
 /* {{{ tuple_compare */
 
+/**
+ * Compare two tuple hints.
+ *
+ * Returns:
+ *
+ *   -1 if the first tuple is less than the second tuple
+ *   +1 if the first tuple is greater than the second tuple
+ *    0 if the first tuple may be less than, equal to, or
+ *      greater than the second tuple, and a full tuple
+ *      comparison is needed to determine the order.
+ */
+static inline int
+hint_cmp(hint_t hint_a, hint_t hint_b)
+{
+	if (hint_a != HINT_NONE && hint_b != HINT_NONE && hint_a != hint_b)
+		return hint_a < hint_b ? -1 : 1;
+	return 0;
+}
+
 /*
  * Compare two tuple fields.
  * Separate version exists since compare is a very
@@ -53,7 +72,8 @@ enum mp_class {
 	MP_CLASS_STR,
 	MP_CLASS_BIN,
 	MP_CLASS_ARRAY,
-	MP_CLASS_MAP
+	MP_CLASS_MAP,
+	mp_class_max,
 };
 
 static enum mp_class mp_classes[] = {
@@ -427,13 +447,17 @@ tuple_compare_field_with_type(const char *field_a, enum mp_type a_type,
 
 template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
-tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
-		       struct key_def *key_def)
+tuple_compare_slowpath_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+			      const struct tuple *tuple_b, hint_t tuple_b_hint,
+			      struct key_def *key_def)
 {
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+	if (rc != 0)
+		return rc;
 	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
 	const char *tuple_b_raw = tuple_data(tuple_b);
@@ -469,7 +493,6 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	struct key_part *end;
 	const char *field_a, *field_b;
 	enum mp_type a_type, b_type;
-	int rc;
 	if (is_nullable)
 		end = part + key_def->unique_part_count;
 	else
@@ -559,8 +582,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 
 template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
-tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
-				uint32_t part_count, struct key_def *key_def)
+tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
+		       struct key_def *key_def)
+{
+	return tuple_compare_slowpath_hinted
+		<is_nullable, has_optional_parts, has_json_paths>
+		(tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def);
+}
+
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath_hinted(const struct tuple *tuple,
+		hint_t tuple_hint, const char *key, uint32_t part_count,
+		hint_t key_hint, struct key_def *key_def)
 {
 	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
@@ -568,6 +602,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key != NULL || part_count == 0);
 	assert(part_count <= key_def->part_count);
+	int rc = hint_cmp(tuple_hint, key_hint);
+	if (rc != 0)
+		return rc;
 	struct key_part *part = key_def->parts;
 	struct tuple_format *format = tuple_format(tuple);
 	const char *tuple_raw = tuple_data(tuple);
@@ -603,7 +640,6 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	}
 
 	struct key_part *end = part + part_count;
-	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
 		const char *field;
 		if (has_json_paths) {
@@ -642,6 +678,16 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	return 0;
 }
 
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
+static inline int
+tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
+				uint32_t part_count, struct key_def *key_def)
+{
+	return tuple_compare_with_key_slowpath_hinted
+		<is_nullable, has_optional_parts, has_json_paths>
+		(tuple, HINT_NONE, key, part_count, HINT_NONE, key_def);
+}
+
 template<bool is_nullable>
 static inline int
 key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
@@ -700,13 +746,17 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
 
 template<bool is_nullable, bool has_optional_parts>
 static inline int
-tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
-				  uint32_t part_count, struct key_def *key_def)
+tuple_compare_with_key_sequential_hinted(const struct tuple *tuple,
+		hint_t tuple_hint, const char *key, uint32_t part_count,
+		hint_t key_hint, struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(key_def_is_sequential(key_def));
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
+	int rc = hint_cmp(tuple_hint, key_hint);
+	if (rc != 0)
+		return rc;
 	const char *tuple_key = tuple_data(tuple);
 	uint32_t field_count = mp_decode_array(&tuple_key);
 	uint32_t cmp_part_count;
@@ -716,8 +766,8 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 		assert(field_count >= part_count);
 		cmp_part_count = part_count;
 	}
-	int rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
-						key_def);
+	rc = key_compare_parts<is_nullable>(tuple_key, key, cmp_part_count,
+					    key_def);
 	if (!has_optional_parts || rc != 0)
 		return rc;
 	/*
@@ -739,6 +789,16 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 	return 0;
 }
 
+template<bool is_nullable, bool has_optional_parts>
+static inline int
+tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
+				  uint32_t part_count, struct key_def *key_def)
+{
+	return tuple_compare_with_key_sequential_hinted
+		<is_nullable, has_optional_parts>
+		(tuple, HINT_NONE, key, part_count, HINT_NONE, key_def);
+}
+
 int
 key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 {
@@ -759,13 +819,17 @@ key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 
 template <bool is_nullable, bool has_optional_parts>
 static int
-tuple_compare_sequential(const struct tuple *tuple_a,
-			 const struct tuple *tuple_b, key_def *key_def)
+tuple_compare_sequential_hinted(const struct tuple *tuple_a, hint_t tuple_a_hint,
+				const struct tuple *tuple_b, hint_t tuple_b_hint,
+				struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key_def_is_sequential(key_def));
 	assert(is_nullable == key_def->is_nullable);
+	int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+	if (rc != 0)
+		return rc;
 	const char *key_a = tuple_data(tuple_a);
 	uint32_t fc_a = mp_decode_array(&key_a);
 	const char *key_b = tuple_data(tuple_b);
@@ -779,7 +843,6 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 	bool was_null_met = false;
 	struct key_part *part = key_def->parts;
 	struct key_part *end = part + key_def->unique_part_count;
-	int rc;
 	uint32_t i = 0;
 	for (; part < end; ++part, ++i) {
 		enum mp_type a_type, b_type;
@@ -826,6 +889,16 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 	return 0;
 }
 
+template <bool is_nullable, bool has_optional_parts>
+static int
+tuple_compare_sequential(const struct tuple *tuple_a,
+			 const struct tuple *tuple_b, struct key_def *key_def)
+{
+	return tuple_compare_sequential_hinted
+		<is_nullable, has_optional_parts>
+		(tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def);
+}
+
 template <int TYPE>
 static inline int
 field_compare(const char **field_a, const char **field_b);
@@ -956,6 +1029,18 @@ struct TupleCompare
 			compare(tuple_a, tuple_b, format_a,
 				format_b, field_a, field_b);
 	}
+
+	static int compare_hinted(const struct tuple *tuple_a,
+				  hint_t tuple_a_hint,
+				  const struct tuple *tuple_b,
+				  hint_t tuple_b_hint,
+				  struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple_a, tuple_b, key_def);
+	}
 };
 
 template <int TYPE, int ...MORE_TYPES>
@@ -973,15 +1058,30 @@ struct TupleCompare<0, TYPE, MORE_TYPES...> {
 		return FieldCompare<0, TYPE, MORE_TYPES...>::compare(tuple_a, tuple_b,
 					format_a, format_b, field_a, field_b);
 	}
+
+	static int compare_hinted(const struct tuple *tuple_a,
+				  hint_t tuple_a_hint,
+				  const struct tuple *tuple_b,
+				  hint_t tuple_b_hint,
+				  struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_a_hint, tuple_b_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple_a, tuple_b, key_def);
+	}
 };
 } /* end of anonymous namespace */
 
 struct comparator_signature {
 	tuple_compare_t f;
+	tuple_compare_hinted_t f_hinted;
 	uint32_t p[64];
 };
 #define COMPARATOR(...) \
-	{ TupleCompare<__VA_ARGS__>::compare, { __VA_ARGS__, UINT32_MAX } },
+	{ TupleCompare<__VA_ARGS__>::compare, \
+	  TupleCompare<__VA_ARGS__>::compare_hinted, \
+	  { __VA_ARGS__, UINT32_MAX } },
 
 /**
  * field1 no, field1 type, field2 no, field2 type, ...
@@ -1133,6 +1233,17 @@ struct TupleCompareWithKey
 				compare(tuple, key, part_count,
 					key_def, format, field);
 	}
+
+	static int
+	compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+		       const char *key, uint32_t part_count, hint_t key_hint,
+		       struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_hint, key_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple, key, part_count, key_def);
+	}
 };
 
 template <int TYPE, int ...MORE_TYPES>
@@ -1153,6 +1264,17 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 			compare(tuple, key, part_count,
 				key_def, format, field);
 	}
+
+	static int
+	compare_hinted(const struct tuple *tuple, hint_t tuple_hint,
+		       const char *key, uint32_t part_count, hint_t key_hint,
+		       struct key_def *key_def)
+	{
+		int rc = hint_cmp(tuple_hint, key_hint);
+		if (rc != 0)
+			return rc;
+		return compare(tuple, key, part_count, key_def);
+	}
 };
 
 } /* end of anonymous namespace */
@@ -1160,11 +1282,14 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 struct comparator_with_key_signature
 {
 	tuple_compare_with_key_t f;
+	tuple_compare_with_key_hinted_t f_hinted;
 	uint32_t p[64];
 };
 
 #define KEY_COMPARATOR(...) \
-	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, { __VA_ARGS__ } },
+	{ TupleCompareWithKey<0, __VA_ARGS__>::compare, \
+	  TupleCompareWithKey<0, __VA_ARGS__>::compare_hinted, \
+	  { __VA_ARGS__ } },
 
 static const comparator_with_key_signature cmp_wk_arr[] = {
 	KEY_COMPARATOR(0, FIELD_TYPE_UNSIGNED, 1, FIELD_TYPE_UNSIGNED, 2, FIELD_TYPE_UNSIGNED)
@@ -1186,6 +1311,356 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
 
 /* }}} tuple_compare_with_key */
 
+/* {{{ tuple_hint */
+
+/**
+ * A comparison hint is an unsigned integer number that has
+ * the following layout:
+ *
+ *     [         class         |         value         ]
+ *      <-- HINT_CLASS_BITS --> <-- HINT_VALUE_BITS -->
+ *      <----------------- HINT_BITS ----------------->
+ *
+ * For simplicity we construct it using the first key part only;
+ * other key parts don't participate in hint construction. As a
+ * consequence, tuple hints are useless if the first key part
+ * doesn't differ among indexed tuples.
+ *
+ * Hint class stores one of mp_class enum values corresponding
+ * to the field type. We store it in upper bits of a hint so
+ * as to guarantee that tuple fields that have different types
+ * are compared correctly in a scalar index. We use the same
+ * layout for indexes of base types too so that we don't need
+ * to recalculate hints when altering an index from scalar to
+ * one of base types or vice versa without index rebuild.
+ *
+ * Hint value is stored in lower bits of a hint and chosen so
+ * as to satisfy the hint property (see the comment to hint_t).
+ * Also, hint values must be compatible among all kinds of numbers
+ * (integer, unsigned, floating point) so that we can alter an
+ * index between any two of them without touching the hints.
+ *
+ * The value depends on the field type:
+ *
+ *  - For an integer field, the hint value equals the number
+ *    stored in the field minus the minimal (negative) integer
+ *    number that fits in a hint. If the number is too large
+ *    or too small to fit in a hint, we use the max or the min
+ *    number that fits. This guarantees correct order for all
+ *    positive and negative integers.
+ *
+ *  - For a field storing a floating point number, we use
+ *    the hint computed for the integral part. This ensures
+ *    compatibility between floating point and integer field
+ *    hints.
+ *
+ *  - For a boolean field, the hint value is simply 0 or 1.
+ *
+ *  - For a string field, we take the first few characters that
+ *    fit in a hint and shift them in such a way that comparing
+ *    them is equivalent to strcmp(). If there's a collation,
+ *    we use the sort key instead of the string (see coll->hint).
+ *
+ *  - For a field containing NULL, the value is 0, and we rely on
+ *    mp_class comparison rules for arranging nullable fields.
+ */
+#define HINT_BITS		(sizeof(hint_t) * CHAR_BIT)
+#define HINT_CLASS_BITS		4
+#define HINT_VALUE_BITS		(HINT_BITS - HINT_CLASS_BITS)
+
+/** Number of bytes that fit in a hint value. */
+#define HINT_VALUE_BYTES	(HINT_VALUE_BITS / CHAR_BIT)
+
+/** Max unsigned integer that can be stored in a hint value. */
+#define HINT_VALUE_MAX		((1ULL << HINT_VALUE_BITS) - 1)
+
+/**
+ * Max and min signed integer numbers that fit in a hint value.
+ * For numbers > MAX and < MIN we store MAX and MIN, respectively.
+ */
+#define HINT_VALUE_INT_MAX	((1LL << (HINT_VALUE_BITS - 1)) - 1)
+#define HINT_VALUE_INT_MIN	(-(1LL << (HINT_VALUE_BITS - 1)))
+
+/**
+ * Max and min floating point numbers whose integral parts fit
+ * in a hint value. Note, we can't compare a floating point number
+ * with HINT_VALUE_INT_{MIN,MAX} because of rounding errors.
+ */
+#define HINT_VALUE_DOUBLE_MAX	(exp2(HINT_VALUE_BITS - 1) - 1)
+#define HINT_VALUE_DOUBLE_MIN	(-exp2(HINT_VALUE_BITS - 1))
+
+/*
+ * HINT_CLASS_BITS should be big enough to store any mp_class value.
+ * Note, ((1 << HINT_CLASS_BITS) - 1) is reserved for HINT_NONE.
+ */
+static_assert(mp_class_max < (1 << HINT_CLASS_BITS) - 1,
+	      "mp_class must fit in tuple hint");
+
+static inline hint_t
+hint_create(enum mp_class c, uint64_t val)
+{
+	assert((val >> HINT_VALUE_BITS) == 0);
+	return (hint_t)(((uint64_t)c << HINT_VALUE_BITS) | val);
+}
+
+static inline hint_t
+hint_nil(void)
+{
+	return hint_create(MP_CLASS_NIL, 0);
+}
+
+static inline hint_t
+hint_bool(bool b)
+{
+	return hint_create(MP_CLASS_BOOL, b ? 1 : 0);
+}
+
+static inline hint_t
+hint_uint(uint64_t u)
+{
+	uint64_t val = (u >= (uint64_t)HINT_VALUE_INT_MAX ?
+			HINT_VALUE_MAX : u - HINT_VALUE_INT_MIN);
+	return hint_create(MP_CLASS_NUMBER, val);
+}
+
+static inline hint_t
+hint_int(int64_t i)
+{
+	assert(i < 0);
+	uint64_t val = (i <= HINT_VALUE_INT_MIN ? 0 : i - HINT_VALUE_INT_MIN);
+	return hint_create(MP_CLASS_NUMBER, val);
+}
+
+static inline hint_t
+hint_double(double d)
+{
+	if (!isfinite(d))
+		return HINT_NONE;
+
+	uint64_t val;
+	if (d >= HINT_VALUE_DOUBLE_MAX)
+		val = HINT_VALUE_MAX;
+	else if (d <= HINT_VALUE_DOUBLE_MIN)
+		val = 0;
+	else
+		val = (int64_t)d - HINT_VALUE_INT_MIN;
+
+	return hint_create(MP_CLASS_NUMBER, val);
+}
+
+static inline uint64_t
+hint_str_raw(const char *s, uint32_t len)
+{
+	len = MIN(len, HINT_VALUE_BYTES);
+	uint64_t val = 0;
+	for (uint32_t i = 0; i < len; i++) {
+		val <<= CHAR_BIT;
+		val |= (unsigned char)s[i];
+	}
+	val <<= CHAR_BIT * (HINT_VALUE_BYTES - len);
+	return val;
+}
+
+static inline hint_t
+hint_str(const char *s, uint32_t len)
+{
+	uint64_t val = hint_str_raw(s, len);
+	return hint_create(MP_CLASS_STR, val);
+}
+
+static inline hint_t
+hint_str_coll(const char *s, uint32_t len, struct coll *coll)
+{
+	char buf[HINT_VALUE_BYTES];
+	uint32_t buf_len = coll->hint(s, len, buf, sizeof(buf), coll);
+	uint64_t val = hint_str_raw(buf, buf_len);
+	return hint_create(MP_CLASS_STR, val);
+}
+
+static inline hint_t
+hint_bin(const char *s, uint32_t len)
+{
+	uint64_t val = hint_str_raw(s, len);
+	return hint_create(MP_CLASS_BIN, val);
+}
+
+static inline hint_t
+field_hint_boolean(const char *field)
+{
+	assert(mp_typeof(*field) == MP_BOOL);
+	return hint_bool(mp_decode_bool(&field));
+}
+
+static inline hint_t
+field_hint_unsigned(const char *field)
+{
+	assert(mp_typeof(*field) == MP_UINT);
+	return hint_uint(mp_decode_uint(&field));
+}
+
+static inline hint_t
+field_hint_integer(const char *field)
+{
+	switch (mp_typeof(*field)) {
+	case MP_UINT:
+		return hint_uint(mp_decode_uint(&field));
+	case MP_INT:
+		return hint_int(mp_decode_int(&field));
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+static inline hint_t
+field_hint_number(const char *field)
+{
+	switch (mp_typeof(*field)) {
+	case MP_UINT:
+		return hint_uint(mp_decode_uint(&field));
+	case MP_INT:
+		return hint_int(mp_decode_int(&field));
+	case MP_FLOAT:
+		return hint_double(mp_decode_float(&field));
+	case MP_DOUBLE:
+		return hint_double(mp_decode_double(&field));
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+static inline hint_t
+field_hint_string(const char *field, struct coll *coll)
+{
+	assert(mp_typeof(*field) == MP_STR);
+	uint32_t len = mp_decode_strl(&field);
+	return coll == NULL ? hint_str(field, len) :
+			      hint_str_coll(field, len, coll);
+}
+
+static inline hint_t
+field_hint_scalar(const char *field, struct coll *coll)
+{
+	uint32_t len;
+	switch(mp_typeof(*field)) {
+	case MP_BOOL:
+		return hint_bool(mp_decode_bool(&field));
+	case MP_UINT:
+		return hint_uint(mp_decode_uint(&field));
+	case MP_INT:
+		return hint_int(mp_decode_int(&field));
+	case MP_FLOAT:
+		return hint_double(mp_decode_float(&field));
+	case MP_DOUBLE:
+		return hint_double(mp_decode_double(&field));
+	case MP_STR:
+		len = mp_decode_strl(&field);
+		return coll == NULL ? hint_str(field, len) :
+				      hint_str_coll(field, len, coll);
+	case MP_BIN:
+		len = mp_decode_binl(&field);
+		return hint_bin(field, len);
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable>
+static inline hint_t
+field_hint(const char *field, struct coll *coll)
+{
+	if (is_nullable && mp_typeof(*field) == MP_NIL)
+		return hint_nil();
+	switch (type) {
+	case FIELD_TYPE_BOOLEAN:
+		return field_hint_boolean(field);
+	case FIELD_TYPE_UNSIGNED:
+		return field_hint_unsigned(field);
+	case FIELD_TYPE_INTEGER:
+		return field_hint_integer(field);
+	case FIELD_TYPE_NUMBER:
+		return field_hint_number(field);
+	case FIELD_TYPE_STRING:
+		return field_hint_string(field, coll);
+	case FIELD_TYPE_SCALAR:
+		return field_hint_scalar(field, coll);
+	default:
+		unreachable();
+	}
+	return HINT_NONE;
+}
+
+template <enum field_type type, bool is_nullable>
+static hint_t
+key_hint(const char *key, uint32_t part_count, struct key_def *key_def)
+{
+	if (part_count == 0)
+		return HINT_NONE;
+	return field_hint<type, is_nullable>(key, key_def->parts->coll);
+}
+
+template <enum field_type type, bool is_nullable>
+static hint_t
+tuple_hint(const struct tuple *tuple, struct key_def *key_def)
+{
+	const char *field = tuple_field_by_part(tuple, key_def->parts);
+	if (is_nullable && field == NULL)
+		return hint_nil();
+	return field_hint<type, is_nullable>(field, key_def->parts->coll);
+}
+
+template<enum field_type type, bool is_nullable>
+static void
+key_def_set_hint_func(struct key_def *def)
+{
+	def->key_hint = key_hint<type, is_nullable>;
+	def->tuple_hint = tuple_hint<type, is_nullable>;
+}
+
+template<enum field_type type>
+static void
+key_def_set_hint_func(struct key_def *def)
+{
+	if (key_part_is_nullable(def->parts))
+		key_def_set_hint_func<type, true>(def);
+	else
+		key_def_set_hint_func<type, false>(def);
+}
+
+static void
+key_def_set_hint_func(struct key_def *def)
+{
+	switch (def->parts->type) {
+	case FIELD_TYPE_BOOLEAN:
+		key_def_set_hint_func<FIELD_TYPE_BOOLEAN>(def);
+		break;
+	case FIELD_TYPE_UNSIGNED:
+		key_def_set_hint_func<FIELD_TYPE_UNSIGNED>(def);
+		break;
+	case FIELD_TYPE_INTEGER:
+		key_def_set_hint_func<FIELD_TYPE_INTEGER>(def);
+		break;
+	case FIELD_TYPE_NUMBER:
+		key_def_set_hint_func<FIELD_TYPE_NUMBER>(def);
+		break;
+	case FIELD_TYPE_STRING:
+		key_def_set_hint_func<FIELD_TYPE_STRING>(def);
+		break;
+	case FIELD_TYPE_SCALAR:
+		key_def_set_hint_func<FIELD_TYPE_SCALAR>(def);
+		break;
+	default:
+		/* Invalid key definition. */
+		def->key_hint = NULL;
+		def->tuple_hint = NULL;
+		break;
+	}
+}
+
+/* }}} tuple_hint */
+
 static void
 key_def_set_compare_func_fast(struct key_def *def)
 {
@@ -1195,7 +1670,9 @@ key_def_set_compare_func_fast(struct key_def *def)
 	assert(!key_def_has_collation(def));
 
 	tuple_compare_t cmp = NULL;
+	tuple_compare_hinted_t cmp_hinted = NULL;
 	tuple_compare_with_key_t cmp_wk = NULL;
+	tuple_compare_with_key_hinted_t cmp_wk_hinted = NULL;
 	bool is_sequential = key_def_is_sequential(def);
 
 	/*
@@ -1210,6 +1687,7 @@ key_def_set_compare_func_fast(struct key_def *def)
 				break;
 		if (i == def->part_count && cmp_arr[k].p[i * 2] == UINT32_MAX) {
 			cmp = cmp_arr[k].f;
+			cmp_hinted = cmp_arr[k].f_hinted;
 			break;
 		}
 	}
@@ -1222,6 +1700,7 @@ key_def_set_compare_func_fast(struct key_def *def)
 		}
 		if (i == def->part_count) {
 			cmp_wk = cmp_wk_arr[k].f;
+			cmp_wk_hinted = cmp_wk_arr[k].f_hinted;
 			break;
 		}
 	}
@@ -1229,15 +1708,23 @@ key_def_set_compare_func_fast(struct key_def *def)
 		cmp = is_sequential ?
 			tuple_compare_sequential<false, false> :
 			tuple_compare_slowpath<false, false, false>;
+		cmp_hinted = is_sequential ?
+			tuple_compare_sequential_hinted<false, false> :
+			tuple_compare_slowpath_hinted<false, false, false>;
 	}
 	if (cmp_wk == NULL) {
 		cmp_wk = is_sequential ?
 			tuple_compare_with_key_sequential<false, false> :
 			tuple_compare_with_key_slowpath<false, false, false>;
+		cmp_wk_hinted = is_sequential ?
+			tuple_compare_with_key_sequential_hinted<false, false> :
+			tuple_compare_with_key_slowpath_hinted<false, false, false>;
 	}
 
 	def->tuple_compare = cmp;
+	def->tuple_compare_hinted = cmp_hinted;
 	def->tuple_compare_with_key = cmp_wk;
+	def->tuple_compare_with_key_hinted = cmp_wk_hinted;
 }
 
 template<bool is_nullable, bool has_optional_parts>
@@ -1248,13 +1735,23 @@ key_def_set_compare_func_plain(struct key_def *def)
 	if (key_def_is_sequential(def)) {
 		def->tuple_compare = tuple_compare_sequential
 					<is_nullable, has_optional_parts>;
+		def->tuple_compare_hinted = tuple_compare_sequential_hinted
+					<is_nullable, has_optional_parts>;
 		def->tuple_compare_with_key = tuple_compare_with_key_sequential
 					<is_nullable, has_optional_parts>;
+		def->tuple_compare_with_key_hinted =
+					tuple_compare_with_key_sequential_hinted
+					<is_nullable, has_optional_parts>;
 	} else {
 		def->tuple_compare = tuple_compare_slowpath
 				<is_nullable, has_optional_parts, false>;
+		def->tuple_compare_hinted = tuple_compare_slowpath_hinted
+				<is_nullable, has_optional_parts, false>;
 		def->tuple_compare_with_key = tuple_compare_with_key_slowpath
 				<is_nullable, has_optional_parts, false>;
+		def->tuple_compare_with_key_hinted =
+					tuple_compare_with_key_slowpath_hinted
+					<is_nullable, has_optional_parts, false>;
 	}
 }
 
@@ -1265,8 +1762,13 @@ key_def_set_compare_func_json(struct key_def *def)
 	assert(def->has_json_paths);
 	def->tuple_compare = tuple_compare_slowpath
 			<is_nullable, has_optional_parts, true>;
+	def->tuple_compare_hinted = tuple_compare_slowpath_hinted
+			<is_nullable, has_optional_parts, true>;
 	def->tuple_compare_with_key = tuple_compare_with_key_slowpath
 			<is_nullable, has_optional_parts, true>;
+	def->tuple_compare_with_key_hinted =
+			tuple_compare_with_key_slowpath_hinted
+			<is_nullable, has_optional_parts, true>;
 }
 
 void
@@ -1294,4 +1796,5 @@ key_def_set_compare_func(struct key_def *def)
 			key_def_set_compare_func_json<false, false>(def);
 		}
 	}
+	key_def_set_hint_func(def);
 }
diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
index afc15e80..b83f0fdc 100644
--- a/src/lib/coll/coll.c
+++ b/src/lib/coll/coll.c
@@ -133,6 +133,30 @@ coll_bin_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
 	return s_len;
 }
 
+static size_t
+coll_icu_hint(const char *s, size_t s_len, char *buf, size_t buf_len,
+	      struct coll *coll)
+{
+	assert(coll->type == COLL_TYPE_ICU);
+	UCharIterator itr;
+	uiter_setUTF8(&itr, s, s_len);
+	uint32_t state[2] = {0, 0};
+	UErrorCode status = U_ZERO_ERROR;
+	return ucol_nextSortKeyPart(coll->collator, &itr, state,
+				    (uint8_t *)buf, buf_len, &status);
+}
+
+static size_t
+coll_bin_hint(const char *s, size_t s_len, char *buf, size_t buf_len,
+	      struct coll *coll)
+{
+	(void)coll;
+	assert(coll->type == COLL_TYPE_BINARY);
+	size_t len = MIN(s_len, buf_len);
+	memcpy(buf, s, len);
+	return len;
+}
+
 /**
  * Set up ICU collator and init cmp and hash members of collation.
  * @param coll Collation to set up.
@@ -242,6 +266,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
 	}
 	coll->cmp = coll_icu_cmp;
 	coll->hash = coll_icu_hash;
+	coll->hint = coll_icu_hint;
 	return 0;
 }
 
@@ -356,6 +381,7 @@ coll_new(const struct coll_def *def)
 		coll->collator = NULL;
 		coll->cmp = coll_bin_cmp;
 		coll->hash = coll_bin_hash;
+		coll->hint = coll_bin_hint;
 		break;
 	default:
 		unreachable();
diff --git a/src/lib/coll/coll.h b/src/lib/coll/coll.h
index 8c9f9429..c505804f 100644
--- a/src/lib/coll/coll.h
+++ b/src/lib/coll/coll.h
@@ -48,6 +48,9 @@ typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
 typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
 				uint32_t *pcarry, struct coll *coll);
 
+typedef size_t (*coll_hint_f)(const char *s, size_t s_len, char *buf,
+			      size_t buf_len, struct coll *coll);
+
 struct UCollator;
 
 /**
@@ -61,7 +64,16 @@ struct coll {
 	struct UCollator *collator;
 	/** String comparator. */
 	coll_cmp_f cmp;
+	/** String hash function. */
 	coll_hash_f hash;
+	/**
+	 * String comparison hint.
+	 *
+	 * This function copies a sort key for a string to
+	 * the given buffer and returns the number of bytes
+	 * copied. Sort keys may be compared using strcmp().
+	 */
+	coll_hint_f hint;
 	/** Reference counter. */
 	int refs;
 	/**


Thread overview: 11+ messages
-- links below jump to the message on this page --
2019-03-13 12:15 [PATCH v6 0/3] box: introduce hint option for memtx tree index Kirill Shcherbatov
2019-03-13 12:15 ` [PATCH v6 1/3] box: refactor key_def_set_compare_func routine Kirill Shcherbatov
2019-03-14  7:04   ` Vladimir Davydov
2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
2019-03-15 10:55       ` Kirill Shcherbatov
2019-03-19 19:38       ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 2/3] memtx: introduce tuple compare hint Kirill Shcherbatov
2019-03-14  8:19   ` Vladimir Davydov
2019-03-15 10:20     ` [tarantool-patches] " Kirill Shcherbatov
2019-03-20 18:08       ` Vladimir Davydov
2019-03-13 12:15 ` [PATCH v6 3/3] box: introduce multikey indexes Kirill Shcherbatov
