[PATCH v3.1 0/5] box: indexes by JSON path

Tarantool development patches archive
 help / color / mirror / Atom feed

* [PATCH v3.1 0/5] box: indexes by JSON path
@ 2018-09-20  7:46 Kirill Shcherbatov
  2018-09-20  7:46 ` [PATCH v3.1 1/5] box: refactor API to use non-constant key_def Kirill Shcherbatov
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Kirill Shcherbatov @ 2018-09-20  7:46 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, Kirill Shcherbatov

Branch: http://github.com/tarantool/tarantool/tree/kshch/gh-1012-json-indexes
Issue: https://github.com/tarantool/tarantool/issues/1012

Sometimes field data could have complex document structure.
When this structure is consistent across whole document,
you are able to create an index by JSON path.
This came possible with auxiliary structures per tuple_format:
tree of intermediate path fields and hashtable refers to leaf field
that use path as key.
To speed-up data access by JSON index key_part structure extended
with offset_slot cache that points to field_map item containing
data offset for current tuple.
Finally, supported ability to define JSON paths in user-friendly
form containing format field name(that could be changed).

Changes in v3.1:
  - patch is divided into several parts
  - fixed invalid bsize for vy_stmt_new_surrogate_from_key
    typles

Kirill Shcherbatov (5):
  box: refactor API to use non-constant key_def
  box: introduce tuple_field_by_part routine
  box: introduce JSON indexes
  box: introduce offset slot cache in key_part
  box: specify indexes in user-friendly form

 src/box/alter.cc                |   7 +-
 src/box/blackhole.c             |   5 +-
 src/box/engine.h                |  11 +-
 src/box/errcode.h               |   2 +-
 src/box/index_def.c             |  10 +-
 src/box/key_def.c               | 316 ++++++++++++--
 src/box/key_def.h               |  76 ++--
 src/box/lua/index.c             |  73 ++++
 src/box/lua/schema.lua          |  20 +-
 src/box/lua/space.cc            |   5 +
 src/box/memtx_bitset.c          |   5 +-
 src/box/memtx_engine.c          |   9 +-
 src/box/memtx_hash.h            |   4 +-
 src/box/memtx_rtree.c           |   3 +-
 src/box/memtx_space.c           |   5 +-
 src/box/memtx_space.h           |   2 +-
 src/box/schema.cc               |  16 +-
 src/box/space.c                 |   4 +-
 src/box/space.h                 |   8 +-
 src/box/sysview.c               |   3 +-
 src/box/tuple.c                 |  16 +-
 src/box/tuple.h                 |  14 +
 src/box/tuple_bloom.c           |   8 +-
 src/box/tuple_bloom.h           |   8 +-
 src/box/tuple_compare.cc        | 188 ++++++---
 src/box/tuple_compare.h         |   5 +-
 src/box/tuple_extract_key.cc    | 147 ++++---
 src/box/tuple_format.c          | 881 ++++++++++++++++++++++++++++++++++------
 src/box/tuple_format.h          |  85 +++-
 src/box/tuple_hash.cc           |  77 ++--
 src/box/tuple_hash.h            |   9 +-
 src/box/vinyl.c                 |  12 +-
 src/box/vy_history.c            |   2 +-
 src/box/vy_history.h            |   2 +-
 src/box/vy_log.c                |   3 +-
 src/box/vy_lsm.c                |  46 ++-
 src/box/vy_mem.c                |   8 +-
 src/box/vy_mem.h                |  10 +-
 src/box/vy_point_lookup.c       |   2 -
 src/box/vy_range.c              |   2 +-
 src/box/vy_range.h              |   4 +-
 src/box/vy_run.c                |  39 +-
 src/box/vy_run.h                |  34 +-
 src/box/vy_stmt.c               | 163 ++++++--
 src/box/vy_stmt.h               |  39 +-
 src/box/vy_upsert.c             |   2 +-
 src/box/vy_upsert.h             |   2 +-
 src/box/vy_write_iterator.c     |   8 +-
 src/box/vy_write_iterator.h     |   6 +-
 test/box/misc.result            |  57 +--
 test/engine/iterator.result     |   2 +-
 test/engine/tuple.result        | 428 +++++++++++++++++++
 test/engine/tuple.test.lua      | 121 ++++++
 test/unit/vy_iterators_helper.c |   6 +-
 test/unit/vy_mem.c              |   2 +-
 test/unit/vy_point_lookup.c     |   2 +-
 56 files changed, 2462 insertions(+), 562 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3.1 1/5] box: refactor API to use non-constant key_def
  2018-09-20  7:46 [PATCH v3.1 0/5] box: indexes by JSON path Kirill Shcherbatov
@ 2018-09-20  7:46 ` Kirill Shcherbatov
  2018-09-21  9:30   ` Vladimir Davydov
  2018-09-20  7:46 ` [PATCH v3.1 2/5] box: introduce tuple_field_by_part routine Kirill Shcherbatov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Kirill Shcherbatov @ 2018-09-20  7:46 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, Kirill Shcherbatov

To introduce JSON indexes we need changeable key_def containing
key_part definition that would store JSON path and offset slot
and slot epoch in following patches.

Part of #1012.
---
 src/box/key_def.c            |  4 ++--
 src/box/key_def.h            | 27 ++++++++++-----------
 src/box/memtx_hash.h         |  4 ++--
 src/box/tuple_bloom.c        |  8 +++----
 src/box/tuple_bloom.h        |  8 +++----
 src/box/tuple_compare.cc     | 57 ++++++++++++++++++++------------------------
 src/box/tuple_compare.h      |  5 ++--
 src/box/tuple_extract_key.cc | 11 ++++-----
 src/box/tuple_hash.cc        | 23 +++++++++---------
 src/box/tuple_hash.h         |  9 ++++---
 src/box/vy_history.c         |  2 +-
 src/box/vy_history.h         |  2 +-
 src/box/vy_mem.c             |  8 +++----
 src/box/vy_mem.h             | 10 ++++----
 src/box/vy_range.c           |  2 +-
 src/box/vy_range.h           |  4 ++--
 src/box/vy_run.c             | 39 +++++++++++++-----------------
 src/box/vy_run.h             | 34 ++++++++++++--------------
 src/box/vy_stmt.c            | 15 +++++-------
 src/box/vy_stmt.h            | 30 +++++++++++------------
 src/box/vy_upsert.c          |  2 +-
 src/box/vy_upsert.h          |  2 +-
 src/box/vy_write_iterator.c  |  8 +++----
 src/box/vy_write_iterator.h  |  6 ++---
 24 files changed, 145 insertions(+), 175 deletions(-)

diff --git a/src/box/key_def.c b/src/box/key_def.c
index ee09dc9..e8a6fa4 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -208,14 +208,14 @@ box_key_def_delete(box_key_def_t *key_def)
 
 int
 box_tuple_compare(const box_tuple_t *tuple_a, const box_tuple_t *tuple_b,
-		  const box_key_def_t *key_def)
+		  box_key_def_t *key_def)
 {
 	return tuple_compare(tuple_a, tuple_b, key_def);
 }
 
 int
 box_tuple_compare_with_key(const box_tuple_t *tuple_a, const char *key_b,
-			   const box_key_def_t *key_def)
+			   box_key_def_t *key_def)
 {
 	uint32_t part_count = mp_decode_array(&key_b);
 	return tuple_compare_with_key(tuple_a, key_b, part_count, key_def);
diff --git a/src/box/key_def.h b/src/box/key_def.h
index aecbe03..6cc705b 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -83,26 +83,26 @@ struct tuple;
 typedef int (*tuple_compare_with_key_t)(const struct tuple *tuple_a,
 					const char *key,
 					uint32_t part_count,
-					const struct key_def *key_def);
+					struct key_def *key_def);
 /** @copydoc tuple_compare() */
 typedef int (*tuple_compare_t)(const struct tuple *tuple_a,
 			       const struct tuple *tuple_b,
-			       const struct key_def *key_def);
+			       struct key_def *key_def);
 /** @copydoc tuple_extract_key() */
 typedef char *(*tuple_extract_key_t)(const struct tuple *tuple,
-				     const struct key_def *key_def,
+				     struct key_def *key_def,
 				     uint32_t *key_size);
 /** @copydoc tuple_extract_key_raw() */
 typedef char *(*tuple_extract_key_raw_t)(const char *data,
 					 const char *data_end,
-					 const struct key_def *key_def,
+					 struct key_def *key_def,
 					 uint32_t *key_size);
 /** @copydoc tuple_hash() */
 typedef uint32_t (*tuple_hash_t)(const struct tuple *tuple,
-				 const struct key_def *key_def);
+				 struct key_def *key_def);
 /** @copydoc key_hash() */
 typedef uint32_t (*key_hash_t)(const char *key,
-				const struct key_def *key_def);
+				struct key_def *key_def);
 
 /* Definition of a multipart key. */
 struct key_def {
@@ -201,7 +201,7 @@ box_key_def_delete(box_key_def_t *key_def);
  */
 int
 box_tuple_compare(const box_tuple_t *tuple_a, const box_tuple_t *tuple_b,
-		  const box_key_def_t *key_def);
+		  box_key_def_t *key_def);
 
 /**
  * @brief Compare tuple with key using the key definition.
@@ -216,7 +216,7 @@ box_tuple_compare(const box_tuple_t *tuple_a, const box_tuple_t *tuple_b,
 
 int
 box_tuple_compare_with_key(const box_tuple_t *tuple_a, const char *key_b,
-			   const box_key_def_t *key_def);
+			   box_key_def_t *key_def);
 
 /** \endcond public */
 
@@ -443,7 +443,7 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
  * @retval NULL     Memory allocation error
  */
 static inline char *
-tuple_extract_key(const struct tuple *tuple, const struct key_def *key_def,
+tuple_extract_key(const struct tuple *tuple, struct key_def *key_def,
 		  uint32_t *key_size)
 {
 	return key_def->tuple_extract_key(tuple, key_def, key_size);
@@ -464,7 +464,7 @@ tuple_extract_key(const struct tuple *tuple, const struct key_def *key_def,
  */
 static inline char *
 tuple_extract_key_raw(const char *data, const char *data_end,
-		      const struct key_def *key_def, uint32_t *key_size)
+		      struct key_def *key_def, uint32_t *key_size)
 {
 	return key_def->tuple_extract_key_raw(data, data_end, key_def,
 					      key_size);
@@ -483,8 +483,7 @@ tuple_extract_key_raw(const char *data, const char *data_end,
  * @retval >0 if key_a > key_b
  */
 int
-key_compare(const char *key_a, const char *key_b,
-	    const struct key_def *key_def);
+key_compare(const char *key_a, const char *key_b, struct key_def *key_def);
 
 /**
  * Compare tuples using the key definition.
@@ -497,7 +496,7 @@ key_compare(const char *key_a, const char *key_b,
  */
 static inline int
 tuple_compare(const struct tuple *tuple_a, const struct tuple *tuple_b,
-	      const struct key_def *key_def)
+	      struct key_def *key_def)
 {
 	return key_def->tuple_compare(tuple_a, tuple_b, key_def);
 }
@@ -515,7 +514,7 @@ tuple_compare(const struct tuple *tuple_a, const struct tuple *tuple_b,
  */
 static inline int
 tuple_compare_with_key(const struct tuple *tuple, const char *key,
-		       uint32_t part_count, const struct key_def *key_def)
+		       uint32_t part_count, struct key_def *key_def)
 {
 	return key_def->tuple_compare_with_key(tuple, key, part_count, key_def);
 }
diff --git a/src/box/memtx_hash.h b/src/box/memtx_hash.h
index a3b4805..10663fc 100644
--- a/src/box/memtx_hash.h
+++ b/src/box/memtx_hash.h
@@ -39,14 +39,14 @@ extern "C" {
 
 static inline bool
 memtx_hash_equal(struct tuple *tuple_a, struct tuple *tuple_b,
-		 const struct key_def *key_def)
+		 struct key_def *key_def)
 {
 	return tuple_compare(tuple_a, tuple_b, key_def) == 0;
 }
 
 static inline bool
 memtx_hash_equal_key(struct tuple *tuple, const char *key,
-		     const struct key_def *key_def)
+		     struct key_def *key_def)
 {
 	return tuple_compare_with_key(tuple, key, key_def->part_count,
 				      key_def) == 0;
diff --git a/src/box/tuple_bloom.c b/src/box/tuple_bloom.c
index ffad151..dc40698 100644
--- a/src/box/tuple_bloom.c
+++ b/src/box/tuple_bloom.c
@@ -74,8 +74,7 @@ tuple_bloom_builder_delete(struct tuple_bloom_builder *builder)
 
 int
 tuple_bloom_builder_add(struct tuple_bloom_builder *builder,
-			const struct tuple *tuple,
-			const struct key_def *key_def,
+			const struct tuple *tuple, struct key_def *key_def,
 			uint32_t hashed_parts)
 {
 	assert(builder->part_count == key_def->part_count);
@@ -168,8 +167,7 @@ tuple_bloom_delete(struct tuple_bloom *bloom)
 
 bool
 tuple_bloom_maybe_has(const struct tuple_bloom *bloom,
-		      const struct tuple *tuple,
-		      const struct key_def *key_def)
+		      const struct tuple *tuple, struct key_def *key_def)
 {
 	if (bloom->is_legacy) {
 		return bloom_maybe_has(&bloom->parts[0],
@@ -195,7 +193,7 @@ tuple_bloom_maybe_has(const struct tuple_bloom *bloom,
 bool
 tuple_bloom_maybe_has_key(const struct tuple_bloom *bloom,
 			  const char *key, uint32_t part_count,
-			  const struct key_def *key_def)
+			  struct key_def *key_def)
 {
 	if (bloom->is_legacy) {
 		if (part_count < key_def->part_count)
diff --git a/src/box/tuple_bloom.h b/src/box/tuple_bloom.h
index 505933d..b05fee1 100644
--- a/src/box/tuple_bloom.h
+++ b/src/box/tuple_bloom.h
@@ -117,8 +117,7 @@ tuple_bloom_builder_delete(struct tuple_bloom_builder *builder);
  */
 int
 tuple_bloom_builder_add(struct tuple_bloom_builder *builder,
-			const struct tuple *tuple,
-			const struct key_def *key_def,
+			const struct tuple *tuple, struct key_def *key_def,
 			uint32_t hashed_parts);
 
 /**
@@ -147,8 +146,7 @@ tuple_bloom_delete(struct tuple_bloom *bloom);
  */
 bool
 tuple_bloom_maybe_has(const struct tuple_bloom *bloom,
-		      const struct tuple *tuple,
-		      const struct key_def *key_def);
+		      const struct tuple *tuple, struct key_def *key_def);
 
 /**
  * Check if a tuple matching a key was stored in a tuple bloom filter.
@@ -162,7 +160,7 @@ tuple_bloom_maybe_has(const struct tuple_bloom *bloom,
 bool
 tuple_bloom_maybe_has_key(const struct tuple_bloom *bloom,
 			  const char *key, uint32_t part_count,
-			  const struct key_def *key_def);
+			  struct key_def *key_def);
 
 /**
  * Return the size of a tuple bloom filter when encoded.
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index e53afba..f9b4597 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -426,9 +426,8 @@ tuple_compare_field_with_hint(const char *field_a, enum mp_type a_type,
 }
 
 uint32_t
-tuple_common_key_parts(const struct tuple *tuple_a,
-		       const struct tuple *tuple_b,
-		       const struct key_def *key_def)
+tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
+		       struct key_def *key_def)
 {
 	uint32_t i;
 	for (i = 0; i < key_def->part_count; i++) {
@@ -452,12 +451,12 @@ tuple_common_key_parts(const struct tuple *tuple_a,
 template<bool is_nullable, bool has_optional_parts>
 static inline int
 tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
-		       const struct key_def *key_def)
+		       struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
-	const struct key_part *part = key_def->parts;
+	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
 	const char *tuple_b_raw = tuple_data(tuple_b);
 	if (key_def->part_count == 1 && part->fieldno == 0) {
@@ -488,7 +487,7 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	const struct tuple_format *format_b = tuple_format(tuple_b);
 	const uint32_t *field_map_a = tuple_field_map(tuple_a);
 	const uint32_t *field_map_b = tuple_field_map(tuple_b);
-	const struct key_part *end;
+	struct key_part *end;
 	const char *field_a, *field_b;
 	enum mp_type a_type, b_type;
 	int rc;
@@ -568,15 +567,14 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 template<bool is_nullable, bool has_optional_parts>
 static inline int
 tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
-				uint32_t part_count,
-				const struct key_def *key_def)
+				uint32_t part_count, struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(key != NULL || part_count == 0);
 	assert(part_count <= key_def->part_count);
-	const struct key_part *part = key_def->parts;
+	struct key_part *part = key_def->parts;
 	const struct tuple_format *format = tuple_format(tuple);
 	const char *tuple_raw = tuple_data(tuple);
 	const uint32_t *field_map = tuple_field_map(tuple);
@@ -605,7 +603,7 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 		}
 	}
 
-	const struct key_part *end = part + part_count;
+	struct key_part *end = part + part_count;
 	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
 		const char *field;
@@ -643,11 +641,11 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 template<bool is_nullable>
 static inline int
 key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
-		  const struct key_def *key_def)
+		  struct key_def *key_def)
 {
 	assert(is_nullable == key_def->is_nullable);
 	assert((key_a != NULL && key_b != NULL) || part_count == 0);
-	const struct key_part *part = key_def->parts;
+	struct key_part *part = key_def->parts;
 	if (likely(part_count == 1)) {
 		if (! is_nullable) {
 			return tuple_compare_field(key_a, key_b, part->type,
@@ -667,7 +665,7 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
 		}
 	}
 
-	const struct key_part *end = part + part_count;
+	struct key_part *end = part + part_count;
 	int rc;
 	for (; part < end; ++part, mp_next(&key_a), mp_next(&key_b)) {
 		if (! is_nullable) {
@@ -699,8 +697,7 @@ key_compare_parts(const char *key_a, const char *key_b, uint32_t part_count,
 template<bool is_nullable, bool has_optional_parts>
 static inline int
 tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
-				  uint32_t part_count,
-				  const struct key_def *key_def)
+				  uint32_t part_count, struct key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(key_def_is_sequential(key_def));
@@ -739,8 +736,7 @@ tuple_compare_with_key_sequential(const struct tuple *tuple, const char *key,
 }
 
 int
-key_compare(const char *key_a, const char *key_b,
-	    const struct key_def *key_def)
+key_compare(const char *key_a, const char *key_b, struct key_def *key_def)
 {
 	uint32_t part_count_a = mp_decode_array(&key_a);
 	uint32_t part_count_b = mp_decode_array(&key_b);
@@ -760,8 +756,7 @@ key_compare(const char *key_a, const char *key_b,
 template <bool is_nullable, bool has_optional_parts>
 static int
 tuple_compare_sequential(const struct tuple *tuple_a,
-			 const struct tuple *tuple_b,
-			 const struct key_def *key_def)
+			 const struct tuple *tuple_b, key_def *key_def)
 {
 	assert(!has_optional_parts || is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
@@ -778,8 +773,8 @@ tuple_compare_sequential(const struct tuple *tuple_a,
 						key_def->part_count, key_def);
 	}
 	bool was_null_met = false;
-	const struct key_part *part = key_def->parts;
-	const struct key_part *end = part + key_def->unique_part_count;
+	struct key_part *part = key_def->parts;
+	struct key_part *end = part + key_def->unique_part_count;
 	int rc;
 	uint32_t i = 0;
 	for (; part < end; ++part, ++i) {
@@ -944,7 +939,7 @@ struct TupleCompare
 {
 	static int compare(const struct tuple *tuple_a,
 			   const struct tuple *tuple_b,
-			   const struct key_def *)
+			   struct key_def *)
 	{
 		struct tuple_format *format_a = tuple_format(tuple_a);
 		struct tuple_format *format_b = tuple_format(tuple_b);
@@ -963,7 +958,7 @@ template <int TYPE, int ...MORE_TYPES>
 struct TupleCompare<0, TYPE, MORE_TYPES...> {
 	static int compare(const struct tuple *tuple_a,
 			   const struct tuple *tuple_b,
-			   const struct key_def *)
+			   struct key_def *)
 	{
 		struct tuple_format *format_a = tuple_format(tuple_a);
 		struct tuple_format *format_b = tuple_format(tuple_b);
@@ -1115,7 +1110,7 @@ struct FieldCompareWithKey<FLD_ID, IDX, TYPE, IDX2, TYPE2, MORE_TYPES...>
 {
 	inline static int
 	compare(const struct tuple *tuple, const char *key,
-		uint32_t part_count, const struct key_def *key_def,
+		uint32_t part_count, struct key_def *key_def,
 		const struct tuple_format *format, const char *field)
 	{
 		int r;
@@ -1141,11 +1136,11 @@ struct FieldCompareWithKey<FLD_ID, IDX, TYPE, IDX2, TYPE2, MORE_TYPES...>
 template <int FLD_ID, int IDX, int TYPE>
 struct FieldCompareWithKey<FLD_ID, IDX, TYPE> {
 	inline static int compare(const struct tuple *,
-					 const char *key,
-					 uint32_t,
-					 const struct key_def *,
-					 const struct tuple_format *,
-					 const char *field)
+				  const char *key,
+				  uint32_t,
+				  struct key_def *,
+				  const struct tuple_format *,
+				  const char *field)
 	{
 		return field_compare_with_key<TYPE>(&field, &key);
 	}
@@ -1159,7 +1154,7 @@ struct TupleCompareWithKey
 {
 	static int
 	compare(const struct tuple *tuple, const char *key,
-		uint32_t part_count, const struct key_def *key_def)
+		uint32_t part_count, struct key_def *key_def)
 	{
 		/* Part count can be 0 in wildcard searches. */
 		if (part_count == 0)
@@ -1180,7 +1175,7 @@ struct TupleCompareWithKey<0, 0, TYPE, MORE_TYPES...>
 	static int compare(const struct tuple *tuple,
 				  const char *key,
 				  uint32_t part_count,
-				  const struct key_def *key_def)
+				  struct key_def *key_def)
 	{
 		/* Part count can be 0 in wildcard searches. */
 		if (part_count == 0)
diff --git a/src/box/tuple_compare.h b/src/box/tuple_compare.h
index 2a9875a..e3a6320 100644
--- a/src/box/tuple_compare.h
+++ b/src/box/tuple_compare.h
@@ -49,9 +49,8 @@ extern "C" {
  * @return number of key parts the two tuples have in common
  */
 uint32_t
-tuple_common_key_parts(const struct tuple *tuple_a,
-		       const struct tuple *tuple_b,
-		       const struct key_def *key_def);
+tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
+		       struct key_def *key_def);
 
 /**
  * Create a comparison function for the key_def
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index 880abb6..e43f997 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -22,8 +22,7 @@ key_def_contains_sequential_parts(const struct key_def *def)
 template <bool has_optional_parts>
 static char *
 tuple_extract_key_sequential_raw(const char *data, const char *data_end,
-				 const struct key_def *key_def,
-				 uint32_t *key_size)
+				 struct key_def *key_def, uint32_t *key_size)
 {
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(key_def_is_sequential(key_def));
@@ -72,8 +71,7 @@ tuple_extract_key_sequential_raw(const char *data, const char *data_end,
  */
 template <bool has_optional_parts>
 static inline char *
-tuple_extract_key_sequential(const struct tuple *tuple,
-			     const struct key_def *key_def,
+tuple_extract_key_sequential(const struct tuple *tuple, struct key_def *key_def,
 			     uint32_t *key_size)
 {
 	assert(key_def_is_sequential(key_def));
@@ -94,7 +92,7 @@ tuple_extract_key_sequential(const struct tuple *tuple,
 template <bool contains_sequential_parts, bool has_optional_parts>
 static char *
 tuple_extract_key_slowpath(const struct tuple *tuple,
-			   const struct key_def *key_def, uint32_t *key_size)
+			   struct key_def *key_def, uint32_t *key_size)
 {
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
@@ -204,8 +202,7 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 template <bool has_optional_parts>
 static char *
 tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
-			       const struct key_def *key_def,
-			       uint32_t *key_size)
+			       struct key_def *key_def, uint32_t *key_size)
 {
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index dee9be3..45a786a 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -103,7 +103,7 @@ struct KeyFieldHash<TYPE> {
 
 template <int TYPE, int ...MORE_TYPES>
 struct KeyHash {
-	static uint32_t hash(const char *key, const struct key_def *)
+	static uint32_t hash(const char *key, struct key_def *)
 	{
 		uint32_t h = HASH_SEED;
 		uint32_t carry = 0;
@@ -116,7 +116,7 @@ struct KeyHash {
 
 template <>
 struct KeyHash<FIELD_TYPE_UNSIGNED> {
-	static uint32_t hash(const char *key, const struct key_def *key_def)
+	static uint32_t hash(const char *key, struct key_def *key_def)
 	{
 		uint64_t val = mp_decode_uint(&key);
 		(void) key_def;
@@ -152,7 +152,7 @@ template <int TYPE, int ...MORE_TYPES>
 struct TupleHash
 {
 	static uint32_t hash(const struct tuple *tuple,
-			     const struct key_def *key_def)
+			     struct key_def *key_def)
 	{
 		uint32_t h = HASH_SEED;
 		uint32_t carry = 0;
@@ -167,7 +167,7 @@ struct TupleHash
 template <>
 struct TupleHash<FIELD_TYPE_UNSIGNED> {
 	static uint32_t	hash(const struct tuple *tuple,
-			     const struct key_def *key_def)
+			     struct key_def *key_def)
 	{
 		const char *field = tuple_field(tuple, key_def->parts->fieldno);
 		uint64_t val = mp_decode_uint(&field);
@@ -213,10 +213,10 @@ static const hasher_signature hash_arr[] = {
 
 template <bool has_optional_parts>
 uint32_t
-tuple_hash_slowpath(const struct tuple *tuple, const struct key_def *key_def);
+tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def);
 
 uint32_t
-key_hash_slowpath(const char *key, const struct key_def *key_def);
+key_hash_slowpath(const char *key, struct key_def *key_def);
 
 void
 tuple_hash_func_set(struct key_def *key_def) {
@@ -308,9 +308,8 @@ tuple_hash_null(uint32_t *ph1, uint32_t *pcarry)
 }
 
 uint32_t
-tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry,
-		    const struct tuple *tuple,
-		    const struct key_part *part)
+tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry, const struct tuple *tuple,
+		    struct key_part *part)
 {
 	const char *field = tuple_field(tuple, part->fieldno);
 	if (field == NULL)
@@ -320,7 +319,7 @@ tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry,
 
 template <bool has_optional_parts>
 uint32_t
-tuple_hash_slowpath(const struct tuple *tuple, const struct key_def *key_def)
+tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 {
 	assert(has_optional_parts == key_def->has_optional_parts);
 	uint32_t h = HASH_SEED;
@@ -357,13 +356,13 @@ tuple_hash_slowpath(const struct tuple *tuple, const struct key_def *key_def)
 }
 
 uint32_t
-key_hash_slowpath(const char *key, const struct key_def *key_def)
+key_hash_slowpath(const char *key, struct key_def *key_def)
 {
 	uint32_t h = HASH_SEED;
 	uint32_t carry = 0;
 	uint32_t total_size = 0;
 
-	for (const struct key_part *part = key_def->parts;
+	for (struct key_part *part = key_def->parts;
 	     part < key_def->parts + key_def->part_count; part++) {
 		total_size += tuple_hash_field(&h, &carry, &key, part->coll);
 	}
diff --git a/src/box/tuple_hash.h b/src/box/tuple_hash.h
index aab8f54..abc961b 100644
--- a/src/box/tuple_hash.h
+++ b/src/box/tuple_hash.h
@@ -70,9 +70,8 @@ tuple_hash_field(uint32_t *ph1, uint32_t *pcarry, const char **field,
  * This function updates @ph1 and @pcarry.
  */
 uint32_t
-tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry,
-		    const struct tuple *tuple,
-		    const struct key_part *part);
+tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry, const struct tuple *tuple,
+		    struct key_part *part);
 
 /**
  * Calculates a common hash value for a tuple
@@ -81,7 +80,7 @@ tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry,
  * @return - hash value
  */
 static inline uint32_t
-tuple_hash(const struct tuple *tuple, const struct key_def *key_def)
+tuple_hash(const struct tuple *tuple, struct key_def *key_def)
 {
 	return key_def->tuple_hash(tuple, key_def);
 }
@@ -93,7 +92,7 @@ tuple_hash(const struct tuple *tuple, const struct key_def *key_def)
  * @return - hash value
  */
 static inline uint32_t
-key_hash(const char *key, const struct key_def *key_def)
+key_hash(const char *key, struct key_def *key_def)
 {
 	return key_def->key_hash(key, key_def);
 }
diff --git a/src/box/vy_history.c b/src/box/vy_history.c
index 498da97..0f3b711 100644
--- a/src/box/vy_history.c
+++ b/src/box/vy_history.c
@@ -73,7 +73,7 @@ vy_history_cleanup(struct vy_history *history)
 }
 
 int
-vy_history_apply(struct vy_history *history, const struct key_def *cmp_def,
+vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
 		 struct tuple_format *format, bool keep_delete,
 		 int *upserts_applied, struct tuple **ret)
 {
diff --git a/src/box/vy_history.h b/src/box/vy_history.h
index 1f8bb59..e3c5a19 100644
--- a/src/box/vy_history.h
+++ b/src/box/vy_history.h
@@ -154,7 +154,7 @@ vy_history_cleanup(struct vy_history *history);
  * will return NULL unless @keep_delete flag is set.
  */
 int
-vy_history_apply(struct vy_history *history, const struct key_def *cmp_def,
+vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
 		 struct tuple_format *format, bool keep_delete,
 		 int *upserts_applied, struct tuple **ret);
 
diff --git a/src/box/vy_mem.c b/src/box/vy_mem.c
index f9be850..ccd079f 100644
--- a/src/box/vy_mem.c
+++ b/src/box/vy_mem.c
@@ -97,7 +97,7 @@ vy_mem_tree_extent_free(void *ctx, void *p)
 
 struct vy_mem *
 vy_mem_new(struct vy_mem_env *env, int64_t generation,
-	   const struct key_def *cmp_def, struct tuple_format *format,
+	   struct key_def *cmp_def, struct tuple_format *format,
 	   struct tuple_format *format_with_colmask,
 	   uint32_t space_cache_version)
 {
@@ -321,7 +321,7 @@ vy_mem_iterator_find_lsn(struct vy_mem_iterator *itr,
 {
 	assert(!vy_mem_tree_iterator_is_invalid(&itr->curr_pos));
 	assert(itr->curr_stmt == vy_mem_iterator_curr_stmt(itr));
-	const struct key_def *cmp_def = itr->mem->cmp_def;
+	struct key_def *cmp_def = itr->mem->cmp_def;
 	while (vy_stmt_lsn(itr->curr_stmt) > (**itr->read_view).vlsn ||
 	       vy_stmt_flags(itr->curr_stmt) & VY_STMT_SKIP_READ) {
 		if (vy_mem_iterator_step(itr, iterator_type) != 0 ||
@@ -461,7 +461,7 @@ vy_mem_iterator_next_key(struct vy_mem_iterator *itr)
 	assert(itr->mem->version == itr->version);
 	assert(!vy_mem_tree_iterator_is_invalid(&itr->curr_pos));
 	assert(itr->curr_stmt == vy_mem_iterator_curr_stmt(itr));
-	const struct key_def *cmp_def = itr->mem->cmp_def;
+	struct key_def *cmp_def = itr->mem->cmp_def;
 
 	const struct tuple *prev_stmt = itr->curr_stmt;
 	do {
@@ -493,7 +493,7 @@ vy_mem_iterator_next_lsn(struct vy_mem_iterator *itr)
 	assert(itr->mem->version == itr->version);
 	assert(!vy_mem_tree_iterator_is_invalid(&itr->curr_pos));
 	assert(itr->curr_stmt == vy_mem_iterator_curr_stmt(itr));
-	const struct key_def *cmp_def = itr->mem->cmp_def;
+	struct key_def *cmp_def = itr->mem->cmp_def;
 
 	struct vy_mem_tree_iterator next_pos = itr->curr_pos;
 next:
diff --git a/src/box/vy_mem.h b/src/box/vy_mem.h
index 29b60ac..d6afeed 100644
--- a/src/box/vy_mem.h
+++ b/src/box/vy_mem.h
@@ -88,7 +88,7 @@ struct tree_mem_key {
  */
 static int
 vy_mem_tree_cmp(const struct tuple *a, const struct tuple *b,
-		const struct key_def *cmp_def)
+		struct key_def *cmp_def)
 {
 	int res = vy_tuple_compare(a, b, cmp_def);
 	if (res)
@@ -102,7 +102,7 @@ vy_mem_tree_cmp(const struct tuple *a, const struct tuple *b,
  */
 static int
 vy_mem_tree_cmp_key(const struct tuple *a, struct tree_mem_key *key,
-		    const struct key_def *cmp_def)
+		    struct key_def *cmp_def)
 {
 	int res = vy_stmt_compare(a, key->stmt, cmp_def);
 	if (res == 0) {
@@ -123,7 +123,7 @@ vy_mem_tree_cmp_key(const struct tuple *a, struct tree_mem_key *key,
 #define BPS_TREE_COMPARE_KEY(a, b, cmp_def) vy_mem_tree_cmp_key(a, b, cmp_def)
 #define bps_tree_elem_t const struct tuple *
 #define bps_tree_key_t struct tree_mem_key *
-#define bps_tree_arg_t const struct key_def *
+#define bps_tree_arg_t struct key_def *
 #define BPS_TREE_NO_DEBUG
 
 #include <salad/bps_tree.h>
@@ -183,7 +183,7 @@ struct vy_mem {
 	 * Key definition for this index, extended with primary
 	 * key parts.
 	 */
-	const struct key_def *cmp_def;
+	struct key_def *cmp_def;
 	/** version is initially 0 and is incremented on every write */
 	uint32_t version;
 	/** Data dictionary cache version at the time of creation. */
@@ -265,7 +265,7 @@ vy_mem_wait_pinned(struct vy_mem *mem)
  */
 struct vy_mem *
 vy_mem_new(struct vy_mem_env *env, int64_t generation,
-	   const struct key_def *cmp_def, struct tuple_format *format,
+	   struct key_def *cmp_def, struct tuple_format *format,
 	   struct tuple_format *format_with_colmask,
 	   uint32_t space_cache_version);
 
diff --git a/src/box/vy_range.c b/src/box/vy_range.c
index 4495ecd..0558098 100644
--- a/src/box/vy_range.c
+++ b/src/box/vy_range.c
@@ -178,7 +178,7 @@ vy_range_tree_find_by_key(vy_range_tree_t *tree,
 
 struct vy_range *
 vy_range_new(int64_t id, struct tuple *begin, struct tuple *end,
-	     const struct key_def *cmp_def)
+	     struct key_def *cmp_def)
 {
 	struct vy_range *range = calloc(1, sizeof(*range));
 	if (range == NULL) {
diff --git a/src/box/vy_range.h b/src/box/vy_range.h
index d963ed3..0830479 100644
--- a/src/box/vy_range.h
+++ b/src/box/vy_range.h
@@ -72,7 +72,7 @@ struct vy_range {
 	 * keys, to ensure an always distinct result for
 	 * non-unique keys.
 	 */
-	const struct key_def *cmp_def;
+	struct key_def *cmp_def;
 	/** An estimate of the number of statements in this range. */
 	struct vy_disk_stmt_counter count;
 	/**
@@ -197,7 +197,7 @@ vy_range_tree_find_by_key(vy_range_tree_t *tree,
  */
 struct vy_range *
 vy_range_new(int64_t id, struct tuple *begin, struct tuple *end,
-	     const struct key_def *cmp_def);
+	     struct key_def *cmp_def);
 
 /**
  * Free a range and all its slices.
diff --git a/src/box/vy_run.c b/src/box/vy_run.c
index f107e3a..7485d97 100644
--- a/src/box/vy_run.c
+++ b/src/box/vy_run.c
@@ -302,8 +302,8 @@ vy_run_bloom_size(struct vy_run *run)
  */
 static uint32_t
 vy_page_index_find_page(struct vy_run *run, const struct tuple *key,
-			const struct key_def *cmp_def,
-			enum iterator_type itype, bool *equal_key)
+			struct key_def *cmp_def, enum iterator_type itype,
+			bool *equal_key)
 {
 	if (itype == ITER_EQ)
 		itype = ITER_GE; /* One day it'll become obsolete */
@@ -365,9 +365,8 @@ vy_page_index_find_page(struct vy_run *run, const struct tuple *key,
 }
 
 struct vy_slice *
-vy_slice_new(int64_t id, struct vy_run *run,
-	     struct tuple *begin, struct tuple *end,
-	     const struct key_def *cmp_def)
+vy_slice_new(int64_t id, struct vy_run *run, struct tuple *begin,
+	     struct tuple *end, struct key_def *cmp_def)
 {
 	struct vy_slice *slice = malloc(sizeof(*slice));
 	if (slice == NULL) {
@@ -446,9 +445,8 @@ vy_slice_delete(struct vy_slice *slice)
 }
 
 int
-vy_slice_cut(struct vy_slice *slice, int64_t id,
-	     struct tuple *begin, struct tuple *end,
-	     const struct key_def *cmp_def,
+vy_slice_cut(struct vy_slice *slice, int64_t id, struct tuple *begin,
+	     struct tuple *end, struct key_def *cmp_def,
 	     struct vy_slice **result)
 {
 	*result = NULL;
@@ -1148,7 +1146,7 @@ vy_run_iterator_find_lsn(struct vy_run_iterator *itr,
 			 const struct tuple *key, struct tuple **ret)
 {
 	struct vy_slice *slice = itr->slice;
-	const struct key_def *cmp_def = itr->cmp_def;
+	struct key_def *cmp_def = itr->cmp_def;
 
 	*ret = NULL;
 
@@ -1228,7 +1226,7 @@ vy_run_iterator_do_seek(struct vy_run_iterator *itr,
 	*ret = NULL;
 
 	struct tuple_bloom *bloom = run->info.bloom;
-	const struct key_def *key_def = itr->key_def;
+	struct key_def *key_def = itr->key_def;
 	if (iterator_type == ITER_EQ && bloom != NULL) {
 		bool need_lookup;
 		if (vy_stmt_type(key) == IPROTO_SELECT) {
@@ -1318,7 +1316,7 @@ vy_run_iterator_seek(struct vy_run_iterator *itr,
 		     enum iterator_type iterator_type,
 		     const struct tuple *key, struct tuple **ret)
 {
-	const struct key_def *cmp_def = itr->cmp_def;
+	struct key_def *cmp_def = itr->cmp_def;
 	struct vy_slice *slice = itr->slice;
 	const struct tuple *check_eq_key = NULL;
 	int cmp;
@@ -1392,8 +1390,7 @@ vy_run_iterator_open(struct vy_run_iterator *itr,
 		     struct vy_run_iterator_stat *stat,
 		     struct vy_slice *slice, enum iterator_type iterator_type,
 		     const struct tuple *key, const struct vy_read_view **rv,
-		     const struct key_def *cmp_def,
-		     const struct key_def *key_def,
+		     struct key_def *cmp_def, struct key_def *key_def,
 		     struct tuple_format *format,
 		     bool is_primary)
 {
@@ -1729,7 +1726,7 @@ fail:
 /* dump statement to the run page buffers (stmt header and data) */
 static int
 vy_run_dump_stmt(const struct tuple *value, struct xlog *data_xlog,
-		 struct vy_page_info *info, const struct key_def *key_def,
+		 struct vy_page_info *info, struct key_def *key_def,
 		 bool is_primary)
 {
 	struct xrow_header xrow;
@@ -2019,9 +2016,9 @@ fail:
 
 int
 vy_run_writer_create(struct vy_run_writer *writer, struct vy_run *run,
-		const char *dirpath, uint32_t space_id, uint32_t iid,
-		const struct key_def *cmp_def, const struct key_def *key_def,
-		uint64_t page_size, double bloom_fpr)
+		     const char *dirpath, uint32_t space_id, uint32_t iid,
+		     struct key_def *cmp_def, struct key_def *key_def,
+		     uint64_t page_size, double bloom_fpr)
 {
 	memset(writer, 0, sizeof(*writer));
 	writer->run = run;
@@ -2285,10 +2282,8 @@ vy_run_writer_abort(struct vy_run_writer *writer)
 int
 vy_run_rebuild_index(struct vy_run *run, const char *dir,
 		     uint32_t space_id, uint32_t iid,
-		     const struct key_def *cmp_def,
-		     const struct key_def *key_def,
-		     struct tuple_format *format,
-		     const struct index_opts *opts)
+		     struct key_def *cmp_def, struct key_def *key_def,
+		     struct tuple_format *format, const struct index_opts *opts)
 {
 	assert(run->info.bloom == NULL);
 	assert(run->page_info == NULL);
@@ -2628,7 +2623,7 @@ static const struct vy_stmt_stream_iface vy_slice_stream_iface = {
 
 void
 vy_slice_stream_open(struct vy_slice_stream *stream, struct vy_slice *slice,
-		     const struct key_def *cmp_def, struct tuple_format *format,
+		     struct key_def *cmp_def, struct tuple_format *format,
 		     bool is_primary)
 {
 	stream->base.iface = &vy_slice_stream_iface;
diff --git a/src/box/vy_run.h b/src/box/vy_run.h
index 5030886..d74f216 100644
--- a/src/box/vy_run.h
+++ b/src/box/vy_run.h
@@ -218,9 +218,9 @@ struct vy_run_iterator {
 
 	/* Members needed for memory allocation and disk access */
 	/** Key definition used for comparing statements on disk. */
-	const struct key_def *cmp_def;
+	struct key_def *cmp_def;
 	/** Key definition provided by the user. */
-	const struct key_def *key_def;
+	struct key_def *key_def;
 	/**
 	 * Format ot allocate REPLACE and DELETE tuples read from
 	 * pages.
@@ -370,8 +370,7 @@ vy_run_recover(struct vy_run *run, const char *dir,
 int
 vy_run_rebuild_index(struct vy_run *run, const char *dir,
 		     uint32_t space_id, uint32_t iid,
-		     const struct key_def *cmp_def,
-		     const struct key_def *key_def,
+		     struct key_def *cmp_def, struct key_def *key_def,
 		     struct tuple_format *format,
 		     const struct index_opts *opts);
 
@@ -428,9 +427,8 @@ vy_run_remove_files(const char *dir, uint32_t space_id,
  * This function increments @run->refs.
  */
 struct vy_slice *
-vy_slice_new(int64_t id, struct vy_run *run,
-	     struct tuple *begin, struct tuple *end,
-	     const struct key_def *cmp_def);
+vy_slice_new(int64_t id, struct vy_run *run, struct tuple *begin,
+	     struct tuple *end, struct key_def *cmp_def);
 
 /**
  * Free a run slice.
@@ -480,9 +478,8 @@ vy_slice_wait_pinned(struct vy_slice *slice)
  * with [@begin, @end), @result is set to NULL.
  */
 int
-vy_slice_cut(struct vy_slice *slice, int64_t id,
-	     struct tuple *begin, struct tuple *end,
-	     const struct key_def *cmp_def,
+vy_slice_cut(struct vy_slice *slice, int64_t id, struct tuple *begin,
+	     struct tuple *end, struct key_def *cmp_def,
 	     struct vy_slice **result);
 
 /**
@@ -496,8 +493,7 @@ vy_run_iterator_open(struct vy_run_iterator *itr,
 		     struct vy_run_iterator_stat *stat,
 		     struct vy_slice *slice, enum iterator_type iterator_type,
 		     const struct tuple *key, const struct vy_read_view **rv,
-		     const struct key_def *cmp_def,
-		     const struct key_def *key_def,
+		     struct key_def *cmp_def, struct key_def *key_def,
 		     struct tuple_format *format, bool is_primary);
 
 /**
@@ -547,7 +543,7 @@ struct vy_slice_stream {
 	 * Key def for comparing with slice boundaries,
 	 * includes secondary key parts.
 	 */
-	const struct key_def *cmp_def;
+	struct key_def *cmp_def;
 	/** Format for allocating REPLACE and DELETE tuples read from pages. */
 	struct tuple_format *format;
 	/** Set if this iterator is for a primary index. */
@@ -559,7 +555,7 @@ struct vy_slice_stream {
  */
 void
 vy_slice_stream_open(struct vy_slice_stream *stream, struct vy_slice *slice,
-		     const struct key_def *cmp_def, struct tuple_format *format,
+		     struct key_def *cmp_def, struct tuple_format *format,
 		     bool is_primary);
 
 /**
@@ -580,9 +576,9 @@ struct vy_run_writer {
 	 * min key, run min/max keys, and secondary index
 	 * statements.
 	 */
-	const struct key_def *cmp_def;
+	struct key_def *cmp_def;
 	/** Key definition to calculate bloom. */
-	const struct key_def *key_def;
+	struct key_def *key_def;
 	/**
 	 * Minimal page size. When a page becames bigger, it is
 	 * dumped.
@@ -610,9 +606,9 @@ struct vy_run_writer {
 /** Create a run writer to fill a run with statements. */
 int
 vy_run_writer_create(struct vy_run_writer *writer, struct vy_run *run,
-		const char *dirpath, uint32_t space_id, uint32_t iid,
-		const struct key_def *cmp_def, const struct key_def *key_def,
-		uint64_t page_size, double bloom_fpr);
+		     const char *dirpath, uint32_t space_id, uint32_t iid,
+		     struct key_def *cmp_def, struct key_def *key_def,
+		     uint64_t page_size, double bloom_fpr);
 
 /**
  * Write a specified statement into a run.
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 37da282..8018dee 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -387,8 +387,7 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
 }
 
 struct tuple *
-vy_stmt_new_surrogate_delete_from_key(const char *key,
-				      const struct key_def *cmp_def,
+vy_stmt_new_surrogate_delete_from_key(const char *key, struct key_def *cmp_def,
 				      struct tuple_format *format)
 {
 	return vy_stmt_new_surrogate_from_key(key, IPROTO_DELETE,
@@ -457,7 +456,7 @@ vy_stmt_new_surrogate_delete_raw(struct tuple_format *format,
 }
 
 struct tuple *
-vy_stmt_extract_key(const struct tuple *stmt, const struct key_def *key_def,
+vy_stmt_extract_key(const struct tuple *stmt, struct key_def *key_def,
 		    struct tuple_format *format)
 {
 	struct region *region = &fiber()->gc;
@@ -475,7 +474,7 @@ vy_stmt_extract_key(const struct tuple *stmt, const struct key_def *key_def,
 
 struct tuple *
 vy_stmt_extract_key_raw(const char *data, const char *data_end,
-			const struct key_def *key_def,
+			struct key_def *key_def,
 			struct tuple_format *format)
 {
 	struct region *region = &fiber()->gc;
@@ -543,9 +542,8 @@ vy_stmt_meta_decode(struct request *request, struct tuple *stmt)
 }
 
 int
-vy_stmt_encode_primary(const struct tuple *value,
-		       const struct key_def *key_def, uint32_t space_id,
-		       struct xrow_header *xrow)
+vy_stmt_encode_primary(const struct tuple *value, struct key_def *key_def,
+		       uint32_t space_id, struct xrow_header *xrow)
 {
 	memset(xrow, 0, sizeof(*xrow));
 	enum iproto_type type = vy_stmt_type(value);
@@ -591,8 +589,7 @@ vy_stmt_encode_primary(const struct tuple *value,
 }
 
 int
-vy_stmt_encode_secondary(const struct tuple *value,
-			 const struct key_def *cmp_def,
+vy_stmt_encode_secondary(const struct tuple *value, struct key_def *cmp_def,
 			 struct xrow_header *xrow)
 {
 	memset(xrow, 0, sizeof(*xrow));
diff --git a/src/box/vy_stmt.h b/src/box/vy_stmt.h
index 273d5e8..c2a5621 100644
--- a/src/box/vy_stmt.h
+++ b/src/box/vy_stmt.h
@@ -331,7 +331,7 @@ vy_stmt_unref_if_possible(struct tuple *stmt)
  */
 static inline int
 vy_key_compare(const struct tuple *a, const struct tuple *b,
-	       const struct key_def *cmp_def)
+	       struct key_def *cmp_def)
 {
 	assert(vy_stmt_type(a) == IPROTO_SELECT);
 	assert(vy_stmt_type(b) == IPROTO_SELECT);
@@ -352,7 +352,7 @@ vy_key_compare(const struct tuple *a, const struct tuple *b,
  */
 static inline int
 vy_tuple_compare(const struct tuple *a, const struct tuple *b,
-		 const struct key_def *cmp_def)
+		 struct key_def *cmp_def)
 {
 	enum iproto_type type;
 	type = vy_stmt_type(a);
@@ -381,7 +381,7 @@ vy_tuple_compare(const struct tuple *a, const struct tuple *b,
  */
 static inline int
 vy_tuple_compare_with_raw_key(const struct tuple *tuple, const char *key,
-			      const struct key_def *key_def)
+			      struct key_def *key_def)
 {
 	uint32_t part_count = mp_decode_array(&key);
 	return tuple_compare_with_key(tuple, key, part_count, key_def);
@@ -390,7 +390,7 @@ vy_tuple_compare_with_raw_key(const struct tuple *tuple, const char *key,
 /** @sa vy_tuple_compare_with_raw_key(). */
 static inline int
 vy_tuple_compare_with_key(const struct tuple *tuple, const struct tuple *key,
-			  const struct key_def *key_def)
+			  struct key_def *key_def)
 {
 	const char *key_mp = tuple_data(key);
 	uint32_t part_count = mp_decode_array(&key_mp);
@@ -400,7 +400,7 @@ vy_tuple_compare_with_key(const struct tuple *tuple, const struct tuple *key,
 /** @sa tuple_compare. */
 static inline int
 vy_stmt_compare(const struct tuple *a, const struct tuple *b,
-		const struct key_def *key_def)
+		struct key_def *key_def)
 {
 	bool a_is_tuple = vy_stmt_type(a) != IPROTO_SELECT;
 	bool b_is_tuple = vy_stmt_type(b) != IPROTO_SELECT;
@@ -419,7 +419,7 @@ vy_stmt_compare(const struct tuple *a, const struct tuple *b,
 /** @sa tuple_compare_with_raw_key. */
 static inline int
 vy_stmt_compare_with_raw_key(const struct tuple *stmt, const char *key,
-			     const struct key_def *key_def)
+			     struct key_def *key_def)
 {
 	if (vy_stmt_type(stmt) != IPROTO_SELECT)
 		return vy_tuple_compare_with_raw_key(stmt, key, key_def);
@@ -429,7 +429,7 @@ vy_stmt_compare_with_raw_key(const struct tuple *stmt, const char *key,
 /** @sa tuple_compare_with_key. */
 static inline int
 vy_stmt_compare_with_key(const struct tuple *stmt, const struct tuple *key,
-			 const struct key_def *key_def)
+			 struct key_def *key_def)
 {
 	assert(vy_stmt_type(key) == IPROTO_SELECT);
 	return vy_stmt_compare_with_raw_key(stmt, tuple_data(key), key_def);
@@ -476,7 +476,7 @@ vy_key_dup(const char *key);
  */
 struct tuple *
 vy_stmt_new_surrogate_delete_from_key(const char *key,
-				      const struct key_def *cmp_def,
+				      struct key_def *cmp_def,
 				      struct tuple_format *format);
 
 /**
@@ -628,7 +628,7 @@ vy_key_from_msgpack(struct tuple_format *format, const char *key)
  * malloc().
  */
 struct tuple *
-vy_stmt_extract_key(const struct tuple *stmt, const struct key_def *key_def,
+vy_stmt_extract_key(const struct tuple *stmt, struct key_def *key_def,
 		    struct tuple_format *format);
 
 /**
@@ -638,7 +638,7 @@ vy_stmt_extract_key(const struct tuple *stmt, const struct key_def *key_def,
  */
 struct tuple *
 vy_stmt_extract_key_raw(const char *data, const char *data_end,
-			const struct key_def *key_def,
+			struct key_def *key_def,
 			struct tuple_format *format);
 
 /**
@@ -654,9 +654,8 @@ vy_stmt_extract_key_raw(const char *data, const char *data_end,
  * @retval -1 if error
  */
 int
-vy_stmt_encode_primary(const struct tuple *value,
-		       const struct key_def *key_def, uint32_t space_id,
-		       struct xrow_header *xrow);
+vy_stmt_encode_primary(const struct tuple *value, struct key_def *key_def,
+		       uint32_t space_id, struct xrow_header *xrow);
 
 /**
  * Encode vy_stmt for a secondary key as xrow_header
@@ -669,8 +668,7 @@ vy_stmt_encode_primary(const struct tuple *value,
  * @retval -1 if error
  */
 int
-vy_stmt_encode_secondary(const struct tuple *value,
-			 const struct key_def *cmp_def,
+vy_stmt_encode_secondary(const struct tuple *value, struct key_def *cmp_def,
 			 struct xrow_header *xrow);
 
 /**
@@ -716,7 +714,7 @@ vy_tuple_format_new_with_colmask(struct tuple_format *mem_format);
  * @retval Does the key contain NULL or not?
  */
 static inline bool
-vy_tuple_key_contains_null(const struct tuple *tuple, const struct key_def *def)
+vy_tuple_key_contains_null(const struct tuple *tuple, struct key_def *def)
 {
 	for (uint32_t i = 0; i < def->part_count; ++i) {
 		const char *field = tuple_field(tuple, def->parts[i].fieldno);
diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index 7af58d9..ebea278 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -88,7 +88,7 @@ vy_upsert_try_to_squash(struct tuple_format *format, struct region *region,
 
 struct tuple *
 vy_apply_upsert(const struct tuple *new_stmt, const struct tuple *old_stmt,
-		const struct key_def *cmp_def, struct tuple_format *format,
+		struct key_def *cmp_def, struct tuple_format *format,
 		bool suppress_error)
 {
 	/*
diff --git a/src/box/vy_upsert.h b/src/box/vy_upsert.h
index 7878b1b..5649961 100644
--- a/src/box/vy_upsert.h
+++ b/src/box/vy_upsert.h
@@ -65,7 +65,7 @@ struct tuple_format;
  */
 struct tuple *
 vy_apply_upsert(const struct tuple *new_stmt, const struct tuple *old_stmt,
-		const struct key_def *cmp_def, struct tuple_format *format,
+		struct key_def *cmp_def, struct tuple_format *format,
 		bool suppress_error);
 
 #if defined(__cplusplus)
diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
index df9c933..1b54e53 100644
--- a/src/box/vy_write_iterator.c
+++ b/src/box/vy_write_iterator.c
@@ -166,7 +166,7 @@ struct vy_write_iterator {
 	/* A heap to order the sources, newest LSN at heap top. */
 	heap_t src_heap;
 	/** Index key definition used to store statements on disk. */
-	const struct key_def *cmp_def;
+	struct key_def *cmp_def;
 	/** Format to allocate new REPLACE and DELETE tuples from vy_run */
 	struct tuple_format *format;
 	/* There is no LSM tree level older than the one we're writing to. */
@@ -340,9 +340,9 @@ static const struct vy_stmt_stream_iface vy_slice_stream_iface;
  * @return the iterator or NULL on error (diag is set).
  */
 struct vy_stmt_stream *
-vy_write_iterator_new(const struct key_def *cmp_def,
-		      struct tuple_format *format, bool is_primary,
-		      bool is_last_level, struct rlist *read_views,
+vy_write_iterator_new(struct key_def *cmp_def, struct tuple_format *format,
+		      bool is_primary, bool is_last_level,
+		      struct rlist *read_views,
 		      struct vy_deferred_delete_handler *handler)
 {
 	/*
diff --git a/src/box/vy_write_iterator.h b/src/box/vy_write_iterator.h
index 5214b60..3b9b535 100644
--- a/src/box/vy_write_iterator.h
+++ b/src/box/vy_write_iterator.h
@@ -269,9 +269,9 @@ struct vy_deferred_delete_handler {
  * @return the iterator or NULL on error (diag is set).
  */
 struct vy_stmt_stream *
-vy_write_iterator_new(const struct key_def *cmp_def,
-		      struct tuple_format *format, bool is_primary,
-		      bool is_last_level, struct rlist *read_views,
+vy_write_iterator_new(struct key_def *cmp_def, struct tuple_format *format,
+		      bool is_primary, bool is_last_level,
+		      struct rlist *read_views,
 		      struct vy_deferred_delete_handler *handler);
 
 /**
-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3.1 2/5] box: introduce tuple_field_by_part routine
  2018-09-20  7:46 [PATCH v3.1 0/5] box: indexes by JSON path Kirill Shcherbatov
  2018-09-20  7:46 ` [PATCH v3.1 1/5] box: refactor API to use non-constant key_def Kirill Shcherbatov
@ 2018-09-20  7:46 ` Kirill Shcherbatov
  2018-09-21  9:33   ` Vladimir Davydov
  2018-09-20  7:46 ` [PATCH v3.1 3/5] box: introduce JSON indexes Kirill Shcherbatov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Kirill Shcherbatov @ 2018-09-20  7:46 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, Kirill Shcherbatov

Start use tuple_field_by_part(_raw) routine in *extract,
*compare, *hash functions. This new function use key_part to
retrieve field data mentioned in key_part. Now it is just a
wrapper for tuple_field_raw but with introducing JSON paths
it would work in other way.

Part of #1012.
---
 src/box/memtx_bitset.c       |  5 +++--
 src/box/memtx_rtree.c        |  3 ++-
 src/box/tuple.h              | 14 ++++++++++++++
 src/box/tuple_compare.cc     | 44 +++++++++++++++++++++++++++-----------------
 src/box/tuple_extract_key.cc |  8 ++++----
 src/box/tuple_format.c       |  7 +++++++
 src/box/tuple_format.h       | 12 ++++++++++++
 src/box/tuple_hash.cc        | 19 ++++++++++++++-----
 src/box/vy_stmt.h            |  9 +++++++--
 9 files changed, 90 insertions(+), 31 deletions(-)

diff --git a/src/box/memtx_bitset.c b/src/box/memtx_bitset.c
index a665f1a..cd7362e 100644
--- a/src/box/memtx_bitset.c
+++ b/src/box/memtx_bitset.c
@@ -283,8 +283,9 @@ memtx_bitset_index_replace(struct index *base, struct tuple *old_tuple,
 	}
 
 	if (new_tuple != NULL) {
-		const char *field;
-		field = tuple_field(new_tuple, base->def->key_def->parts[0].fieldno);
+		const char *field =
+			tuple_field_by_part(new_tuple,
+					    base->def->key_def->parts);
 		uint32_t key_len;
 		const void *key = make_key(field, &key_len);
 #ifndef OLD_GOOD_BITSET
diff --git a/src/box/memtx_rtree.c b/src/box/memtx_rtree.c
index 0b12cda..f2aa6c3 100644
--- a/src/box/memtx_rtree.c
+++ b/src/box/memtx_rtree.c
@@ -112,7 +112,8 @@ extract_rectangle(struct rtree_rect *rect, const struct tuple *tuple,
 		  struct index_def *index_def)
 {
 	assert(index_def->key_def->part_count == 1);
-	const char *elems = tuple_field(tuple, index_def->key_def->parts[0].fieldno);
+	const char *elems = tuple_field_by_part(tuple,
+						index_def->key_def->parts);
 	unsigned dimension = index_def->opts.dimension;
 	uint32_t count = mp_decode_array(&elems);
 	return mp_decode_rect(rect, dimension, elems, count, "Field");
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 2e84516..b638f50 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -43,6 +43,7 @@ extern "C" {
 
 struct slab_arena;
 struct quota;
+struct key_part;
 
 /**
  * A format for standalone tuples allocated on runtime arena.
@@ -522,6 +523,19 @@ tuple_field(const struct tuple *tuple, uint32_t fieldno)
 }
 
 /**
+ * Get a field refereed by index @part in tuple.
+ * @param tuple Tuple to get the field from.
+ * @param part Index part to use.
+ * @retval Field data if the field exists or NULL.
+ */
+static inline const char *
+tuple_field_by_part(const struct tuple *tuple, struct key_part *part)
+{
+	return tuple_field_by_part_raw(tuple_format(tuple), tuple_data(tuple),
+				       tuple_field_map(tuple), part);
+}
+
+/**
  * Get tuple field by its JSON path.
  * @param tuple Tuple to get field from.
  * @param path Field JSON path.
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index f9b4597..e21b009 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -430,10 +430,20 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		       struct key_def *key_def)
 {
 	uint32_t i;
+	struct tuple_format *tuple_a_format = tuple_format(tuple_a);
+	struct tuple_format *tuple_b_format = tuple_format(tuple_b);
+	const char *tuple_a_raw = tuple_data(tuple_a);
+	const char *tuple_b_raw = tuple_data(tuple_b);
+	const uint32_t *tuple_a_field_map = tuple_field_map(tuple_a);
+	const uint32_t *tuple_b_field_map = tuple_field_map(tuple_b);
 	for (i = 0; i < key_def->part_count; i++) {
-		const struct key_part *part = &key_def->parts[i];
-		const char *field_a = tuple_field(tuple_a, part->fieldno);
-		const char *field_b = tuple_field(tuple_b, part->fieldno);
+		struct key_part *part = (struct key_part *)&key_def->parts[i];
+		const char *field_a =
+			tuple_field_by_part_raw(tuple_a_format, tuple_a_raw,
+						tuple_a_field_map, part);
+		const char *field_b =
+			tuple_field_by_part_raw(tuple_b_format, tuple_b_raw,
+						tuple_b_field_map, part);
 		enum mp_type a_type = field_a != NULL ?
 				      mp_typeof(*field_a) : MP_NIL;
 		enum mp_type b_type = field_b != NULL ?
@@ -497,10 +507,10 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		end = part + key_def->part_count;
 
 	for (; part < end; part++) {
-		field_a = tuple_field_raw(format_a, tuple_a_raw, field_map_a,
-					  part->fieldno);
-		field_b = tuple_field_raw(format_b, tuple_b_raw, field_map_b,
-					  part->fieldno);
+		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
+						  field_map_a, part);
+		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
+						  field_map_b, part);
 		assert(has_optional_parts ||
 		       (field_a != NULL && field_b != NULL));
 		if (! is_nullable) {
@@ -547,10 +557,10 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	 */
 	end = key_def->parts + key_def->part_count;
 	for (; part < end; ++part) {
-		field_a = tuple_field_raw(format_a, tuple_a_raw, field_map_a,
-					  part->fieldno);
-		field_b = tuple_field_raw(format_b, tuple_b_raw, field_map_b,
-					  part->fieldno);
+		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
+						  field_map_a, part);
+		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
+						  field_map_b, part);
 		/*
 		 * Extended parts are primary, and they can not
 		 * be absent or be NULLs.
@@ -580,9 +590,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	const uint32_t *field_map = tuple_field_map(tuple);
 	enum mp_type a_type, b_type;
 	if (likely(part_count == 1)) {
-		const char *field;
-		field = tuple_field_raw(format, tuple_raw, field_map,
-					part->fieldno);
+		const char *field =
+			tuple_field_by_part_raw(format, tuple_raw, field_map,
+						part);
 		if (! is_nullable) {
 			return tuple_compare_field(field, key, part->type,
 						   part->coll);
@@ -606,9 +616,9 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	struct key_part *end = part + part_count;
 	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
-		const char *field;
-		field = tuple_field_raw(format, tuple_raw, field_map,
-					part->fieldno);
+		const char *field =
+			tuple_field_by_part_raw(format, tuple_raw,
+						field_map, part);
 		if (! is_nullable) {
 			rc = tuple_compare_field(field, key, part->type,
 						 part->coll);
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index e43f997..e493c3b 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -109,8 +109,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 	/* Calculate the key size. */
 	for (uint32_t i = 0; i < part_count; ++i) {
 		const char *field =
-			tuple_field_raw(format, data, field_map,
-					key_def->parts[i].fieldno);
+			tuple_field_by_part_raw(format, data, field_map,
+						&key_def->parts[i]);
 		if (has_optional_parts && field == NULL) {
 			bsize += mp_sizeof_nil();
 			continue;
@@ -151,8 +151,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 	char *key_buf = mp_encode_array(key, part_count);
 	for (uint32_t i = 0; i < part_count; ++i) {
 		const char *field =
-			tuple_field_raw(format, data, field_map,
-					key_def->parts[i].fieldno);
+			tuple_field_by_part_raw(format, data, field_map,
+						&key_def->parts[i]);
 		if (has_optional_parts && field == NULL) {
 			key_buf = mp_encode_nil(key_buf);
 			continue;
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index b385c0d..91677d4 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -541,6 +541,13 @@ tuple_field_go_to_key(const char **field, const char *key, int len)
 	return -1;
 }
 
+const char *
+tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
+			const uint32_t *field_map, struct key_part *part)
+{
+	return tuple_field_raw(format, data, field_map, part->fieldno);
+}
+
 int
 tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
                         const uint32_t *field_map, const char *path,
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index c7dc48f..0ffdf7d 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -324,6 +324,18 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
 		     const char *tuple);
 
 /**
+ * Get a field refereed by index @part in tuple.
+ * @param format Tuple format.
+ * @param tuple A pointer to MessagePack array.
+ * @param field_map A pointer to the LAST element of field map.
+ * @param part Index part to use.
+ * @retval Field data if the field exists or NULL.
+ */
+const char *
+tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
+			const uint32_t *field_map, struct key_part *part);
+
+/**
  * Get a field at the specific position in this MessagePack array.
  * Returns a pointer to MessagePack data.
  * @param format tuple format
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index 45a786a..b394804 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -157,7 +157,8 @@ struct TupleHash
 		uint32_t h = HASH_SEED;
 		uint32_t carry = 0;
 		uint32_t total_size = 0;
-		const char *field = tuple_field(tuple, key_def->parts->fieldno);
+		const char *field =
+			tuple_field_by_part(tuple, key_def->parts);
 		TupleFieldHash<TYPE, MORE_TYPES...>::
 			hash(&field, &h, &carry, &total_size);
 		return PMurHash32_Result(h, carry, total_size);
@@ -169,7 +170,8 @@ struct TupleHash<FIELD_TYPE_UNSIGNED> {
 	static uint32_t	hash(const struct tuple *tuple,
 			     struct key_def *key_def)
 	{
-		const char *field = tuple_field(tuple, key_def->parts->fieldno);
+		const char *field =
+			tuple_field_by_part(tuple, key_def->parts);
 		uint64_t val = mp_decode_uint(&field);
 		if (likely(val <= UINT32_MAX))
 			return val;
@@ -311,7 +313,7 @@ uint32_t
 tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry, const struct tuple *tuple,
 		    struct key_part *part)
 {
-	const char *field = tuple_field(tuple, part->fieldno);
+	const char *field = tuple_field_by_part(tuple, part);
 	if (field == NULL)
 		return tuple_hash_null(ph1, pcarry);
 	return tuple_hash_field(ph1, pcarry, &field, part->coll);
@@ -326,7 +328,12 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 	uint32_t carry = 0;
 	uint32_t total_size = 0;
 	uint32_t prev_fieldno = key_def->parts[0].fieldno;
-	const char *field = tuple_field(tuple, key_def->parts[0].fieldno);
+	struct tuple_format *format = tuple_format(tuple);
+	const char *tuple_raw = tuple_data(tuple);
+	const uint32_t *field_map = tuple_field_map(tuple);
+	const char *field =
+		tuple_field_by_part_raw(format, tuple_raw, field_map,
+					key_def->parts);
 	const char *end = (char *)tuple + tuple_size(tuple);
 	if (has_optional_parts && field == NULL) {
 		total_size += tuple_hash_null(&h, &carry);
@@ -340,7 +347,9 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 		 * need of tuple_field
 		 */
 		if (prev_fieldno + 1 != key_def->parts[part_id].fieldno) {
-			field = tuple_field(tuple, key_def->parts[part_id].fieldno);
+			struct key_part *part = &key_def->parts[part_id];
+			field = tuple_field_by_part_raw(format, tuple_raw,
+							field_map, part);
 		}
 		if (has_optional_parts && (field == NULL || field >= end)) {
 			total_size += tuple_hash_null(&h, &carry);
diff --git a/src/box/vy_stmt.h b/src/box/vy_stmt.h
index c2a5621..b52b4e2 100644
--- a/src/box/vy_stmt.h
+++ b/src/box/vy_stmt.h
@@ -716,8 +716,13 @@ vy_tuple_format_new_with_colmask(struct tuple_format *mem_format);
 static inline bool
 vy_tuple_key_contains_null(const struct tuple *tuple, struct key_def *def)
 {
-	for (uint32_t i = 0; i < def->part_count; ++i) {
-		const char *field = tuple_field(tuple, def->parts[i].fieldno);
+	struct tuple_format *format = tuple_format(tuple);
+	const char *data = tuple_data(tuple);
+	const uint32_t *field_map = tuple_field_map(tuple);
+	for (struct key_part *part = def->parts, *end = part + def->part_count;
+	     part < end; ++part) {
+		const char *field =
+			tuple_field_by_part_raw(format, data, field_map, part);
 		if (field == NULL || mp_typeof(*field) == MP_NIL)
 			return true;
 	}
-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3.1 3/5] box: introduce JSON indexes
  2018-09-20  7:46 [PATCH v3.1 0/5] box: indexes by JSON path Kirill Shcherbatov
  2018-09-20  7:46 ` [PATCH v3.1 1/5] box: refactor API to use non-constant key_def Kirill Shcherbatov
  2018-09-20  7:46 ` [PATCH v3.1 2/5] box: introduce tuple_field_by_part routine Kirill Shcherbatov
@ 2018-09-20  7:46 ` Kirill Shcherbatov
  2018-09-22 19:15   ` Vladimir Davydov
  2018-09-20  7:46 ` [PATCH v3.1 4/5] box: introduce offset slot cache in key_part Kirill Shcherbatov
  2018-09-20  7:46 ` [PATCH v3.1 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
  4 siblings, 1 reply; 9+ messages in thread
From: Kirill Shcherbatov @ 2018-09-20  7:46 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, Kirill Shcherbatov

As we need to store user-defined JSON path in key_part
and key_part_def, we have introduced path and path_len
fields. JSON path is verified and transformed to canonical
form on index msgpack unpack.
Path string stored as a part of the key_def allocation:

+-------+---------+-------+---------+-------+-------+-------+
|key_def|key_part1|  ...  |key_partN| path1 | pathK | pathN |
+-------+---------+-------+---------+-------+-------+-------+
          |                         ^
          |-> path _________________|

Because of field names specified as format could be changed
key_part path persisted in Tarantool should be always started
with first-level field access via array index(not by name).

To work with JSON-defined indexes we use format JSON
path hashtable data_path and a tree of intermediate path
fields attached to format root fields.

<Hashtable>: format->data_path
[2].FIO.fname -> field "fname" {type=str, off_slot=-1}
[2].FIO.sname -> field "sname" {type=str, off_slot=-2}

<Tree>:      format->field[2] {type = map}
                           |
                   FIO {type = map}
                           |
        "fname"            |             "sname"
{type=str,off_slot=-1} ____|____ {type = str,off_slot=-2}

Leaf fields used in Index have initialized offset_slot.
On new tuple creation we observe fields tree and use leaf
records to init tuple field_map.
At the same time we use data_path hashtable on tuple data
access by index(when cached offset_slot is invalid).
All paths stored at the end of format allocation:
JSON-tree fields same as format->path_hash point to them.

+------------+------------+-------+------------+-------+
|tuple_format|tuple_field1|  ...  |tuple_fieldN| pathK |
+------------+------------+-------+------------+-------+

New routine tuple_format_add_json_path is used to construct
all internal structures for JSON path on new format creation
and duplicating.
Paths are deduplicated in format allocation.

Part of #1012.
---
 src/box/errcode.h            |   2 +-
 src/box/index_def.c          |  10 +-
 src/box/key_def.c            | 300 +++++++++++++--
 src/box/key_def.h            |  38 +-
 src/box/lua/space.cc         |   5 +
 src/box/memtx_engine.c       |   5 +
 src/box/schema.cc            |  12 +-
 src/box/tuple.c              |  12 +-
 src/box/tuple_compare.cc     | 119 ++++--
 src/box/tuple_extract_key.cc | 136 +++++--
 src/box/tuple_format.c       | 873 ++++++++++++++++++++++++++++++++++++-------
 src/box/tuple_format.h       |  63 +++-
 src/box/tuple_hash.cc        |  47 ++-
 src/box/vinyl.c              |   5 +
 src/box/vy_log.c             |   3 +-
 src/box/vy_lsm.c             |   1 +
 src/box/vy_point_lookup.c    |   2 -
 src/box/vy_stmt.c            | 148 ++++++--
 test/box/misc.result         |  57 +--
 test/engine/tuple.result     | 387 +++++++++++++++++++
 test/engine/tuple.test.lua   | 109 ++++++
 21 files changed, 1996 insertions(+), 338 deletions(-)

diff --git a/src/box/errcode.h b/src/box/errcode.h
index 4115e6b..464f413 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -107,7 +107,7 @@ struct errcode_record {
 	/* 52 */_(ER_FUNCTION_EXISTS,		"Function '%s' already exists") \
 	/* 53 */_(ER_BEFORE_REPLACE_RET,	"Invalid return value of space:before_replace trigger: expected tuple or nil, got %s") \
 	/* 54 */_(ER_FUNCTION_MAX,		"A limit on the total number of functions has been reached: %u") \
-	/* 55 */_(ER_UNUSED4,			"") \
+	/* 55 */_(ER_DATA_STRUCTURE_MISMATCH,	"Tuple doesn't math document structure: %s") \
 	/* 56 */_(ER_USER_MAX,			"A limit on the total number of users has been reached: %u") \
 	/* 57 */_(ER_NO_SUCH_ENGINE,		"Space engine '%s' does not exist") \
 	/* 58 */_(ER_RELOAD_CFG,		"Can't set option '%s' dynamically") \
diff --git a/src/box/index_def.c b/src/box/index_def.c
index 9cda63c..f67b952 100644
--- a/src/box/index_def.c
+++ b/src/box/index_def.c
@@ -209,8 +209,14 @@ index_def_is_valid(struct index_def *index_def, const char *space_name)
 			 * Courtesy to a user who could have made
 			 * a typo.
 			 */
-			if (index_def->key_def->parts[i].fieldno ==
-			    index_def->key_def->parts[j].fieldno) {
+			struct key_part *part_a = &index_def->key_def->parts[i];
+			struct key_part *part_b = &index_def->key_def->parts[j];
+			if ((part_a->fieldno == part_b->fieldno &&
+			     part_a->path == NULL && part_b->path == NULL) ||
+			    (part_a->path_len != 0 &&
+			     part_a->path_len == part_b->path_len &&
+			     memcmp(part_a->path, part_b->path,
+				    part_a->path_len) == 0)) {
 				diag_set(ClientError, ER_MODIFY_INDEX,
 					 index_def->name, space_name,
 					 "same key part is indexed twice");
diff --git a/src/box/key_def.c b/src/box/key_def.c
index e8a6fa4..5366eb7 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -35,12 +35,16 @@
 #include "column_mask.h"
 #include "schema_def.h"
 #include "coll_id_cache.h"
+#include "fiber.h"
+#include "assoc.h"
+#include "json/path.h"
 
 static const struct key_part_def key_part_def_default = {
 	0,
 	field_type_MAX,
 	COLL_NONE,
 	false,
+	NULL
 };
 
 static int64_t
@@ -53,6 +57,7 @@ part_type_by_name_wrapper(const char *str, uint32_t len)
 #define PART_OPT_FIELD		"field"
 #define PART_OPT_COLLATION	"collation"
 #define PART_OPT_NULLABILITY	"is_nullable"
+#define PART_OPT_PATH		"path"
 
 const struct opt_def part_def_reg[] = {
 	OPT_DEF_ENUM(PART_OPT_TYPE, field_type, struct key_part_def, type,
@@ -61,6 +66,7 @@ const struct opt_def part_def_reg[] = {
 	OPT_DEF(PART_OPT_COLLATION, OPT_UINT32, struct key_part_def, coll_id),
 	OPT_DEF(PART_OPT_NULLABILITY, OPT_BOOL, struct key_part_def,
 		is_nullable),
+	OPT_DEF(PART_OPT_PATH, OPT_STRPTR, struct key_part_def, path),
 	OPT_END,
 };
 
@@ -96,13 +102,24 @@ const uint32_t key_mp_type[] = {
 struct key_def *
 key_def_dup(const struct key_def *src)
 {
-	size_t sz = key_def_sizeof(src->part_count);
+	const struct key_part *parts = src->parts;
+	const struct key_part *parts_end = parts + src->part_count;
+	size_t sz = 0;
+	for (; parts < parts_end; parts++)
+		sz += parts->path != NULL ? parts->path_len + 1 : 0;
+	sz = key_def_sizeof(src->part_count, sz);
 	struct key_def *res = (struct key_def *)malloc(sz);
 	if (res == NULL) {
 		diag_set(OutOfMemory, sz, "malloc", "res");
 		return NULL;
 	}
 	memcpy(res, src, sz);
+	for (uint32_t i = 0; i < src->part_count; i++) {
+		if (src->parts[i].path == NULL)
+			continue;
+		size_t path_offset = src->parts[i].path - (char *)src;
+		res->parts[i].path = (char *)res + path_offset;
+	}
 	return res;
 }
 
@@ -110,8 +127,17 @@ void
 key_def_swap(struct key_def *old_def, struct key_def *new_def)
 {
 	assert(old_def->part_count == new_def->part_count);
-	for (uint32_t i = 0; i < new_def->part_count; i++)
-		SWAP(old_def->parts[i], new_def->parts[i]);
+	for (uint32_t i = 0; i < new_def->part_count; i++) {
+		if (old_def->parts[i].path == NULL) {
+			SWAP(old_def->parts[i], new_def->parts[i]);
+		} else {
+			size_t path_offset =
+				old_def->parts[i].path - (char *)old_def;
+			SWAP(old_def->parts[i], new_def->parts[i]);
+			old_def->parts[i].path = (char *)old_def + path_offset;
+			new_def->parts[i].path = (char *)new_def + path_offset;
+		}
+	}
 	SWAP(*old_def, *new_def);
 }
 
@@ -131,9 +157,9 @@ key_def_set_cmp(struct key_def *def)
 }
 
 struct key_def *
-key_def_new(uint32_t part_count)
+key_def_new(uint32_t part_count, size_t paths_size)
 {
-	size_t sz = key_def_sizeof(part_count);
+	size_t sz = key_def_sizeof(part_count, paths_size);
 	/** Use calloc() to zero comparator function pointers. */
 	struct key_def *key_def = (struct key_def *) calloc(1, sz);
 	if (key_def == NULL) {
@@ -148,10 +174,13 @@ key_def_new(uint32_t part_count)
 struct key_def *
 key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count)
 {
-	struct key_def *def = key_def_new(part_count);
+	size_t sz = 0;
+	for (uint32_t i = 0; i < part_count; i++)
+		sz += parts[i].path != NULL ? strlen(parts[i].path) + 1 : 0;
+	struct key_def *def = key_def_new(part_count, sz);
 	if (def == NULL)
 		return NULL;
-
+	char *data = (char *)def + key_def_sizeof(part_count, 0);
 	for (uint32_t i = 0; i < part_count; i++) {
 		struct key_part_def *part = &parts[i];
 		struct coll *coll = NULL;
@@ -165,14 +194,22 @@ key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count)
 			}
 			coll = coll_id->coll;
 		}
+		uint32_t path_len = 0;
+		if (part->path != NULL) {
+			path_len = strlen(part->path);
+			def->parts[i].path = data;
+			data += path_len + 1;
+		}
 		key_def_set_part(def, i, part->fieldno, part->type,
-				 part->is_nullable, coll, part->coll_id);
+				 part->is_nullable, coll, part->coll_id,
+				 part->path, path_len);
 	}
 	return def;
 }
 
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
+int
+key_def_dump_parts(struct region *pool, const struct key_def *def,
+		   struct key_part_def *parts)
 {
 	for (uint32_t i = 0; i < def->part_count; i++) {
 		const struct key_part *part = &def->parts[i];
@@ -181,13 +218,26 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
 		part_def->type = part->type;
 		part_def->is_nullable = part->is_nullable;
 		part_def->coll_id = part->coll_id;
+		if (part->path != NULL) {
+			part_def->path = region_alloc(pool, part->path_len + 1);
+			if (part_def->path == NULL) {
+				diag_set(OutOfMemory, part->path_len + 1,
+					 "region_alloc", "part_def->path");
+				return -1;
+			}
+			memcpy(part_def->path, part->path, part->path_len);
+			part_def->path[part->path_len] = '\0';
+		} else {
+			part_def->path = NULL;
+		}
 	}
+	return 0;
 }
 
 box_key_def_t *
 box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
 {
-	struct key_def *key_def = key_def_new(part_count);
+	struct key_def *key_def = key_def_new(part_count, 0);
 	if (key_def == NULL)
 		return key_def;
 
@@ -195,7 +245,7 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
 		key_def_set_part(key_def, item, fields[item],
 				 (enum field_type)types[item],
 				 key_part_def_default.is_nullable, NULL,
-				 COLL_NONE);
+				 COLL_NONE, NULL, 0);
 	}
 	return key_def;
 }
@@ -241,6 +291,13 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
 		if (part1->is_nullable != part2->is_nullable)
 			return part1->is_nullable <
 			       part2->is_nullable ? -1 : 1;
+		/* Lexicographic strings order. */
+		if (part1->path_len != part2->path_len)
+			return part1->path_len - part2->path_len;
+		int rc = 0;
+		if ((rc = memcmp(part1->path, part2->path,
+				 part1->path_len)) != 0)
+			return rc;
 	}
 	return part_count1 < part_count2 ? -1 : part_count1 > part_count2;
 }
@@ -248,16 +305,28 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
 void
 key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 		 enum field_type type, bool is_nullable, struct coll *coll,
-		 uint32_t coll_id)
+		 uint32_t coll_id, const char *path, uint32_t path_len)
 {
 	assert(part_no < def->part_count);
 	assert(type < field_type_MAX);
 	def->is_nullable |= is_nullable;
+	def->has_json_paths |= path != NULL;
 	def->parts[part_no].is_nullable = is_nullable;
 	def->parts[part_no].fieldno = fieldno;
 	def->parts[part_no].type = type;
 	def->parts[part_no].coll = coll;
 	def->parts[part_no].coll_id = coll_id;
+	if (path != NULL) {
+		def->parts[part_no].path_len = path_len;
+		assert(def->parts[part_no].path != NULL);
+		memcpy(def->parts[part_no].path, path, path_len);
+		def->parts[part_no].path[path_len] = '\0';
+		def->parts[part_no].path_hash = mh_strn_hash(path, path_len);
+	} else {
+		def->parts[part_no].path_len = 0;
+		def->parts[part_no].path = NULL;
+		def->parts[part_no].path_hash = 0;
+	}
 	column_mask_set_fieldno(&def->column_mask, fieldno);
 	/**
 	 * When all parts are set, initialize the tuple
@@ -282,13 +351,13 @@ key_def_update_optionality(struct key_def *def, uint32_t min_field_count)
 	for (uint32_t i = 0; i < def->part_count; ++i) {
 		struct key_part *part = &def->parts[i];
 		def->has_optional_parts |= part->is_nullable &&
-					   min_field_count < part->fieldno + 1;
+					   (min_field_count <
+					   part->fieldno + 1);
 		/*
 		 * One optional part is enough to switch to new
 		 * comparators.
 		 */
-		if (def->has_optional_parts)
-			break;
+				       if (def->has_optional_parts) break;
 	}
 	key_def_set_cmp(def);
 }
@@ -302,8 +371,15 @@ key_def_snprint_parts(char *buf, int size, const struct key_part_def *parts,
 	for (uint32_t i = 0; i < part_count; i++) {
 		const struct key_part_def *part = &parts[i];
 		assert(part->type < field_type_MAX);
-		SNPRINT(total, snprintf, buf, size, "%d, '%s'",
-			(int)part->fieldno, field_type_strs[part->type]);
+		if (part->path != NULL) {
+			SNPRINT(total, snprintf, buf, size, "%d, '%s', '%s'",
+				(int) part->fieldno, part->path,
+				field_type_strs[part->type]);
+		} else {
+			SNPRINT(total, snprintf, buf, size, "%d, '%s'",
+				(int) part->fieldno,
+				field_type_strs[part->type]);
+		}
 		if (i < part_count - 1)
 			SNPRINT(total, snprintf, buf, size, ", ");
 	}
@@ -322,6 +398,8 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
 			count++;
 		if (part->is_nullable)
 			count++;
+		if (part->path != NULL)
+			count++;
 		size += mp_sizeof_map(count);
 		size += mp_sizeof_str(strlen(PART_OPT_FIELD));
 		size += mp_sizeof_uint(part->fieldno);
@@ -336,6 +414,10 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
 			size += mp_sizeof_str(strlen(PART_OPT_NULLABILITY));
 			size += mp_sizeof_bool(part->is_nullable);
 		}
+		if (part->path != NULL) {
+			size += mp_sizeof_str(strlen(PART_OPT_PATH));
+			size += mp_sizeof_str(strlen(part->path));
+		}
 	}
 	return size;
 }
@@ -349,6 +431,8 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
 		int count = 2;
 		if (part->coll_id != COLL_NONE)
 			count++;
+		if (part->path != NULL)
+			count++;
 		if (part->is_nullable)
 			count++;
 		data = mp_encode_map(data, count);
@@ -370,6 +454,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
 					     strlen(PART_OPT_NULLABILITY));
 			data = mp_encode_bool(data, part->is_nullable);
 		}
+		if (part->path != NULL) {
+			data = mp_encode_str(data, PART_OPT_PATH,
+					     strlen(PART_OPT_PATH));
+			data = mp_encode_str(data, part->path,
+					     strlen(part->path));
+		}
 	}
 	return data;
 }
@@ -430,8 +520,111 @@ key_def_decode_parts_166(struct key_part_def *parts, uint32_t part_count,
 				     fields[part->fieldno].is_nullable :
 				     key_part_def_default.is_nullable);
 		part->coll_id = COLL_NONE;
+		part->path = NULL;
+	}
+	return 0;
+}
+
+/**
+ * Verify key_part JSON path and convert to canonical form.
+ *
+ * @param region Region to make allocations.
+ * @param part Part with path to update.
+ * @param path_extra Extra allocated space to reuse if possible.
+ * @param path_extra_size The @path_extra size.
+ *
+ * @retval -1 on error.
+ * @retval 0 on success.
+ */
+static int
+key_def_normalize_json_path(struct region *region, struct key_part_def *part,
+			    char **path_extra, uint32_t *path_extra_size)
+{
+	uint32_t allocated_size = *path_extra_size;
+	char *path = *path_extra;
+
+	uint32_t path_len = strlen(part->path);
+	struct json_path_parser parser;
+	struct json_path_node node;
+	json_path_parser_create(&parser, part->path, path_len);
+	/*
+	 * A worst-case scenario is .a -> ["a"]
+	 * i.e. 2.5 * path_len + 1 is enough.
+	 */
+	uint32_t new_path_size = 2.5 * path_len + 1;
+	if (new_path_size >= allocated_size) {
+		path = region_alloc(region, new_path_size);
+		if (path == NULL) {
+			diag_set(OutOfMemory, new_path_size,
+				 "region_alloc", "path");
+			return -1;
+		}
+		allocated_size = new_path_size;
+	}
+	assert(path != NULL);
+	part->path = path;
+	int rc = json_path_next(&parser, &node);
+	if (rc != 0)
+		goto error_invalid_json;
+	if (node.type != JSON_PATH_NUM) {
+		diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+			 part->fieldno,
+			 "invalid JSON path: first part should "
+			 "be defined as array index");
+		return -1;
+	}
+	if (node.num - TUPLE_INDEX_BASE != part->fieldno) {
+		diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+			 part->fieldno,
+			 "invalid JSON path: first part refers "
+			 "to invalid field");
+		return -1;
+	}
+	uint32_t lexemes = 0;
+	do {
+		if (node.type == JSON_PATH_NUM) {
+			path += sprintf(path, "[%llu]",
+					(unsigned long long) node.num);
+		} else if (node.type == JSON_PATH_STR) {
+			path += sprintf(path, "[\"%.*s\"]", node.len, node.str);
+		} else {
+			unreachable();
+		}
+		lexemes++;
+	} while ((rc = json_path_next(&parser, &node)) == 0 &&
+		 node.type != JSON_PATH_END);
+	if (rc != 0 || node.type != JSON_PATH_END)
+		goto error_invalid_json;
+	if (lexemes == 1) {
+		/* JSON index is useless. */
+		path = part->path;
+		part->path = NULL;
+	} else {
+		/* Skip terminating zero. */
+		path++;
+		/* Account constructed string size. */
+		allocated_size -= path - part->path;
+	}
+	/* Going to try to reuse extra allocation next time. */
+	if (allocated_size > (uint32_t)parser.src_len) {
+		/* Use rest of new buffer next time. */
+		*path_extra = path;
+		*path_extra_size = allocated_size;
+	} else {
+		/* Reuse old path buffer. */
+		*path_extra = (char *)parser.src;
+		*path_extra_size = parser.src_len;
 	}
 	return 0;
+
+error_invalid_json: ;
+	const char *err_msg =
+		tt_sprintf("invalid JSON path '%.*s': path has invalid "\
+			   "structure (error at position %d)", parser.src_len,
+			   parser.src, rc);
+	diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+		 part->fieldno + TUPLE_INDEX_BASE, err_msg);
+	return -1;
 }
 
 int
@@ -443,8 +636,11 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 		return key_def_decode_parts_166(parts, part_count, data,
 						fields, field_count);
 	}
-	for (uint32_t i = 0; i < part_count; i++) {
-		struct key_part_def *part = &parts[i];
+	char *path = NULL;
+	uint32_t allocated_size = 0;
+	struct key_part_def *part = parts;
+	struct region *region = &fiber()->gc;
+	for (uint32_t i = 0; i < part_count; i++, part++) {
 		if (mp_typeof(**data) != MP_MAP) {
 			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
 				 i + TUPLE_INDEX_BASE,
@@ -454,7 +650,7 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 		*part = key_part_def_default;
 		if (opts_decode(part, part_def_reg, data,
 				ER_WRONG_INDEX_OPTIONS, i + TUPLE_INDEX_BASE,
-				NULL) != 0)
+				region) != 0)
 			return -1;
 		if (part->type == field_type_MAX) {
 			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
@@ -471,6 +667,10 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 				 "string and scalar parts");
 			return -1;
 		}
+		if (part->path != NULL &&
+		    key_def_normalize_json_path(region, part, &path,
+						&allocated_size) != 0)
+			return -1;
 	}
 	return 0;
 }
@@ -495,20 +695,25 @@ key_def_decode_parts_160(struct key_part_def *parts, uint32_t part_count,
 				     fields[part->fieldno].is_nullable :
 				     key_part_def_default.is_nullable);
 		part->coll_id = COLL_NONE;
+		part->path = NULL;
 	}
 	return 0;
 }
 
-const struct key_part *
-key_def_find(const struct key_def *key_def, uint32_t fieldno)
+bool
+key_def_contains_part(const struct key_def *key_def,
+		      const struct key_part *to_find)
 {
 	const struct key_part *part = key_def->parts;
 	const struct key_part *end = part + key_def->part_count;
 	for (; part != end; part++) {
-		if (part->fieldno == fieldno)
-			return part;
+		if (part->fieldno == to_find->fieldno &&
+		    part->path_len == to_find->path_len &&
+		    (part->path == NULL || memcmp(part->path, to_find->path,
+						  to_find->path_len) == 0))
+			return true;
 	}
-	return NULL;
+	return false;
 }
 
 bool
@@ -517,7 +722,7 @@ key_def_contains(const struct key_def *first, const struct key_def *second)
 	const struct key_part *part = second->parts;
 	const struct key_part *end = part + second->part_count;
 	for (; part != end; part++) {
-		if (key_def_find(first, part->fieldno) == NULL)
+		if (! key_def_contains_part(first, part))
 			return false;
 	}
 	return true;
@@ -531,18 +736,25 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 	 * Find and remove part duplicates, i.e. parts counted
 	 * twice since they are present in both key defs.
 	 */
-	const struct key_part *part = second->parts;
-	const struct key_part *end = part + second->part_count;
+	size_t sz = 0;
+	const struct key_part *part = first->parts;
+	const struct key_part *end = part + first->part_count;
 	for (; part != end; part++) {
-		if (key_def_find(first, part->fieldno))
+		if (part->path != NULL)
+			sz += part->path_len + 1;
+	}
+	part = second->parts;
+	end = part + second->part_count;
+	for (; part != end; part++) {
+		if (key_def_contains_part(first, part))
 			--new_part_count;
+		else if (part->path != NULL)
+			sz += part->path_len + 1;
 	}
-
-	struct key_def *new_def;
-	new_def =  (struct key_def *)calloc(1, key_def_sizeof(new_part_count));
+	sz = key_def_sizeof(new_part_count, sz);
+	struct key_def *new_def = (struct key_def *)calloc(1, sz);
 	if (new_def == NULL) {
-		diag_set(OutOfMemory, key_def_sizeof(new_part_count), "malloc",
-			 "new_def");
+		diag_set(OutOfMemory, sz, "calloc", "new_def");
 		return NULL;
 	}
 	new_def->part_count = new_part_count;
@@ -550,24 +762,36 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 	new_def->is_nullable = first->is_nullable || second->is_nullable;
 	new_def->has_optional_parts = first->has_optional_parts ||
 				      second->has_optional_parts;
+	/* Path data write position in the new key_def. */
+	char *data = (char *)new_def + key_def_sizeof(new_part_count, 0);
 	/* Write position in the new key def. */
 	uint32_t pos = 0;
 	/* Append first key def's parts to the new index_def. */
 	part = first->parts;
 	end = part + first->part_count;
 	for (; part != end; part++) {
+		if (part->path != NULL) {
+			new_def->parts[pos].path = data;
+			data += part->path_len + 1;
+		}
 		key_def_set_part(new_def, pos++, part->fieldno, part->type,
-				 part->is_nullable, part->coll, part->coll_id);
+				 part->is_nullable, part->coll, part->coll_id,
+				 part->path, part->path_len);
 	}
 
 	/* Set-append second key def's part to the new key def. */
 	part = second->parts;
 	end = part + second->part_count;
 	for (; part != end; part++) {
-		if (key_def_find(first, part->fieldno))
+		if (key_def_contains_part(first, part))
 			continue;
+		if (part->path != NULL) {
+			new_def->parts[pos].path = data;
+			data += part->path_len + 1;
+		}
 		key_def_set_part(new_def, pos++, part->fieldno, part->type,
-				 part->is_nullable, part->coll, part->coll_id);
+				 part->is_nullable, part->coll, part->coll_id,
+				 part->path, part->path_len);
 	}
 	return new_def;
 }
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 6cc705b..d451258 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -54,6 +54,8 @@ struct key_part_def {
 	uint32_t coll_id;
 	/** True if a key part can store NULLs. */
 	bool is_nullable;
+	/** JSON path to data. */
+	char *path;
 };
 
 /**
@@ -74,6 +76,12 @@ struct key_part {
 	struct coll *coll;
 	/** True if a part can store NULLs. */
 	bool is_nullable;
+	/** JSON path to data in canonical form. */
+	char *path;
+	/** JSON path length. */
+	uint32_t path_len;
+	/** JSON path hash. */
+	uint32_t path_hash;
 };
 
 struct key_def;
@@ -133,6 +141,10 @@ struct key_def {
 	 * fields assumed to be MP_NIL.
 	 */
 	bool has_optional_parts;
+	/**
+	 * True, if some key part contain JSON path.
+	 */
+	bool has_json_paths;
 	/** Key fields mask. @sa column_mask.h for details. */
 	uint64_t column_mask;
 	/** The size of the 'parts' array. */
@@ -221,16 +233,17 @@ box_tuple_compare_with_key(const box_tuple_t *tuple_a, const char *key_b,
 /** \endcond public */
 
 static inline size_t
-key_def_sizeof(uint32_t part_count)
+key_def_sizeof(uint32_t part_count, size_t paths_size)
 {
-	return sizeof(struct key_def) + sizeof(struct key_part) * part_count;
+	return sizeof(struct key_def) + sizeof(struct key_part) * part_count +
+	       paths_size;
 }
 
 /**
  * Allocate a new key_def with the given part count.
  */
 struct key_def *
-key_def_new(uint32_t part_count);
+key_def_new(uint32_t part_count, size_t paths_size);
 
 /**
  * Allocate a new key_def with the given part count
@@ -242,8 +255,9 @@ key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count);
 /**
  * Dump part definitions of the given key def.
  */
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
+int
+key_def_dump_parts(struct region *pool, const struct key_def *def,
+		   struct key_part_def *parts);
 
 /**
  * Set a single key part in a key def.
@@ -252,7 +266,7 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
 void
 key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 		 enum field_type type, bool is_nullable, struct coll *coll,
-		 uint32_t coll_id);
+		 uint32_t coll_id, const char *path, uint32_t path_len);
 
 /**
  * Update 'has_optional_parts' of @a key_def with correspondence
@@ -314,12 +328,10 @@ key_def_decode_parts_160(struct key_part_def *parts, uint32_t part_count,
 			 const char **data, const struct field_def *fields,
 			 uint32_t field_count);
 
-/**
- * Returns the part in index_def->parts for the specified fieldno.
- * If fieldno is not in index_def->parts returns NULL.
- */
-const struct key_part *
-key_def_find(const struct key_def *key_def, uint32_t fieldno);
+/** Check if @a key_def contains @a to_find part. */
+bool
+key_def_contains_part(const struct key_def *key_def,
+		      const struct key_part *to_find);
 
 /**
  * Check if key definition @a first contains all parts of
@@ -366,6 +378,8 @@ key_validate_parts(const struct key_def *key_def, const char *key,
 static inline bool
 key_def_is_sequential(const struct key_def *key_def)
 {
+	if (key_def->has_json_paths)
+		return false;
 	for (uint32_t part_id = 0; part_id < key_def->part_count; part_id++) {
 		if (key_def->parts[part_id].fieldno != part_id)
 			return false;
diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 25b7e36..875e51f 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -295,6 +295,11 @@ lbox_fillspace(struct lua_State *L, struct space *space, int i)
 			lua_pushnumber(L, part->fieldno + TUPLE_INDEX_BASE);
 			lua_setfield(L, -2, "fieldno");
 
+			if (part->path != NULL) {
+				lua_pushstring(L, part->path);
+				lua_setfield(L, -2, "path");
+			}
+
 			lua_pushboolean(L, part->is_nullable);
 			lua_setfield(L, -2, "is_nullable");
 
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 1f80ce5..554d6f9 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1310,6 +1310,11 @@ memtx_index_def_change_requires_rebuild(struct index *index,
 			return true;
 		if (old_part->coll != new_part->coll)
 			return true;
+		if (old_part->path_len != new_part->path_len)
+			return true;
+		if (memcmp(old_part->path, new_part->path,
+			   old_part->path_len) != 0)
+			return true;
 	}
 	return false;
 }
diff --git a/src/box/schema.cc b/src/box/schema.cc
index 7f20f36..2411c8c 100644
--- a/src/box/schema.cc
+++ b/src/box/schema.cc
@@ -286,19 +286,19 @@ schema_init()
 	 * (and re-created) first.
 	 */
 	/* _schema - key/value space with schema description */
-	struct key_def *key_def = key_def_new(1); /* part count */
+	struct key_def *key_def = key_def_new(1, 0);
 	if (key_def == NULL)
 		diag_raise();
 	auto key_def_guard = make_scoped_guard([&] { key_def_delete(key_def); });
 
 	key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
-			 FIELD_TYPE_STRING, false, NULL, COLL_NONE);
+			 FIELD_TYPE_STRING, false, NULL, COLL_NONE, NULL, 0);
 	sc_space_new(BOX_SCHEMA_ID, "_schema", key_def, &on_replace_schema,
 		     NULL);
 
 	/* _space - home for all spaces. */
 	key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
-			 FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
+			 FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE, NULL, 0);
 
 	/* _collation - collation description. */
 	sc_space_new(BOX_COLLATION_ID, "_collation", key_def,
@@ -341,15 +341,15 @@ schema_init()
 		     NULL);
 
 	key_def_delete(key_def);
-	key_def = key_def_new(2); /* part count */
+	key_def = key_def_new(2, 0);
 	if (key_def == NULL)
 		diag_raise();
 	/* space no */
 	key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
-			 FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
+			 FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE, NULL, 0);
 	/* index no */
 	key_def_set_part(key_def, 1 /* part no */, 1 /* field no */,
-			 FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
+			 FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE, NULL, 0);
 	sc_space_new(BOX_INDEX_ID, "_index", key_def,
 		     &alter_space_on_replace_index, &on_stmt_begin_index);
 
diff --git a/src/box/tuple.c b/src/box/tuple.c
index d7dbad3..83bdad1 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -159,14 +159,22 @@ tuple_validate_raw(struct tuple_format *format, const char *tuple)
 
 	/* Check field types */
 	struct tuple_field *field = &format->fields[0];
+	const char *pos = tuple;
 	uint32_t i = 0;
 	uint32_t defined_field_count = MIN(field_count, format->field_count);
 	for (; i < defined_field_count; ++i, ++field) {
-		if (key_mp_type_validate(field->type, mp_typeof(*tuple),
+		if (key_mp_type_validate(field->type, mp_typeof(*pos),
 					 ER_FIELD_TYPE, i + TUPLE_INDEX_BASE,
 					 field->is_nullable))
 			return -1;
-		mp_next(&tuple);
+		/* Check all JSON paths. */
+		if (field->childs != NULL) {
+			if (tuple_field_bypass_and_init(field, i, tuple, &pos,
+							NULL) != 0)
+				return -1;
+		} else {
+			mp_next(&pos);
+		}
 	}
 	return 0;
 }
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index e21b009..5a3a968 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -458,18 +458,20 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	return i;
 }
 
-template<bool is_nullable, bool has_optional_parts>
+template<bool is_nullable, bool has_optional_parts, bool has_json_path>
 static inline int
 tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		       struct key_def *key_def)
 {
+	assert(has_json_path == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
 	const char *tuple_b_raw = tuple_data(tuple_b);
-	if (key_def->part_count == 1 && part->fieldno == 0) {
+	if (key_def->part_count == 1 && part->fieldno == 0 &&
+	    part->path == NULL) {
 		/*
 		 * First field can not be optional - empty tuples
 		 * can not exist.
@@ -507,10 +509,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		end = part + key_def->part_count;
 
 	for (; part < end; part++) {
-		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
-						  field_map_a, part);
-		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
-						  field_map_b, part);
+		if (!has_json_path) {
+			field_a = tuple_field_raw(format_a, tuple_a_raw,
+						  field_map_a,
+						  part->fieldno);
+			field_b = tuple_field_raw(format_b, tuple_b_raw,
+						  field_map_b,
+						  part->fieldno);
+		} else {
+			field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
+							  field_map_a, part);
+			field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
+							  field_map_b, part);
+		}
 		assert(has_optional_parts ||
 		       (field_a != NULL && field_b != NULL));
 		if (! is_nullable) {
@@ -557,10 +568,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	 */
 	end = key_def->parts + key_def->part_count;
 	for (; part < end; ++part) {
-		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
-						  field_map_a, part);
-		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
-						  field_map_b, part);
+		if (!has_json_path) {
+			field_a = tuple_field_raw(format_a, tuple_a_raw,
+						  field_map_a,
+						  part->fieldno);
+			field_b = tuple_field_raw(format_b, tuple_b_raw,
+						  field_map_b,
+						  part->fieldno);
+		} else {
+			field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
+							  field_map_a, part);
+			field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
+							  field_map_b, part);
+		}
 		/*
 		 * Extended parts are primary, and they can not
 		 * be absent or be NULLs.
@@ -574,11 +594,12 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	return 0;
 }
 
-template<bool is_nullable, bool has_optional_parts>
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
 tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 				uint32_t part_count, struct key_def *key_def)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
@@ -590,9 +611,14 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	const uint32_t *field_map = tuple_field_map(tuple);
 	enum mp_type a_type, b_type;
 	if (likely(part_count == 1)) {
-		const char *field =
-			tuple_field_by_part_raw(format, tuple_raw, field_map,
-						part);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, tuple_raw, field_map,
+						part->fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, tuple_raw,
+							field_map, part);
+		}
 		if (! is_nullable) {
 			return tuple_compare_field(field, key, part->type,
 						   part->coll);
@@ -616,9 +642,14 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	struct key_part *end = part + part_count;
 	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
-		const char *field =
-			tuple_field_by_part_raw(format, tuple_raw,
-						field_map, part);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, tuple_raw, field_map,
+						part->fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, tuple_raw,
+							field_map, part);
+		}
 		if (! is_nullable) {
 			rc = tuple_compare_field(field, key, part->type,
 						 part->coll);
@@ -1011,23 +1042,35 @@ static const comparator_signature cmp_arr[] = {
 
 #undef COMPARATOR
 
+static const tuple_compare_t compare_slowpath_funcs[] = {
+	tuple_compare_slowpath<false, false, false>,
+	tuple_compare_slowpath<true, false, false>,
+	tuple_compare_slowpath<false, true, false>,
+	tuple_compare_slowpath<true, true, false>,
+	tuple_compare_slowpath<false, false, true>,
+	tuple_compare_slowpath<true, false, true>,
+	tuple_compare_slowpath<false, true, true>,
+	tuple_compare_slowpath<true, true, true>
+};
+
 tuple_compare_t
 tuple_compare_create(const struct key_def *def)
 {
+	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
+			   2 * (def->has_optional_parts ? 1 : 0) +
+			   4 * (def->has_json_paths ? 1 : 0);
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts)
 				return tuple_compare_sequential<true, true>;
 			else
 				return tuple_compare_sequential<true, false>;
-		} else if (def->has_optional_parts) {
-			return tuple_compare_slowpath<true, true>;
 		} else {
-			return tuple_compare_slowpath<true, false>;
+			return compare_slowpath_funcs[cmp_func_idx];
 		}
 	}
 	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def)) {
+	if (!key_def_has_collation(def) && !def->has_json_paths) {
 		/* Precalculated comparators don't use collation */
 		for (uint32_t k = 0;
 		     k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
@@ -1043,10 +1086,9 @@ tuple_compare_create(const struct key_def *def)
 				return cmp_arr[k].f;
 		}
 	}
-	if (key_def_is_sequential(def))
-		return tuple_compare_sequential<false, false>;
-	else
-		return tuple_compare_slowpath<false, false>;
+	return key_def_is_sequential(def) ?
+	       tuple_compare_sequential<false, false> :
+	       compare_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare */
@@ -1228,9 +1270,23 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
 
 #undef KEY_COMPARATOR
 
+static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
+	tuple_compare_with_key_slowpath<false, false, false>,
+	tuple_compare_with_key_slowpath<true, false, false>,
+	tuple_compare_with_key_slowpath<false, true, false>,
+	tuple_compare_with_key_slowpath<true, true, false>,
+	tuple_compare_with_key_slowpath<false, false, true>,
+	tuple_compare_with_key_slowpath<true, false, true>,
+	tuple_compare_with_key_slowpath<false, true, true>,
+	tuple_compare_with_key_slowpath<true, true, true>
+};
+
 tuple_compare_with_key_t
 tuple_compare_with_key_create(const struct key_def *def)
 {
+	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
+			   2 * (def->has_optional_parts ? 1 : 0) +
+			   4 * (def->has_json_paths ? 1 : 0);
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts) {
@@ -1240,14 +1296,12 @@ tuple_compare_with_key_create(const struct key_def *def)
 				return tuple_compare_with_key_sequential<true,
 									 false>;
 			}
-		} else if (def->has_optional_parts) {
-			return tuple_compare_with_key_slowpath<true, true>;
 		} else {
-			return tuple_compare_with_key_slowpath<true, false>;
+			return compare_with_key_slowpath_funcs[cmp_func_idx];
 		}
 	}
 	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def)) {
+	if (!key_def_has_collation(def) && !def->has_json_paths) {
 		/* Precalculated comparators don't use collation */
 		for (uint32_t k = 0;
 		     k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
@@ -1266,10 +1320,9 @@ tuple_compare_with_key_create(const struct key_def *def)
 				return cmp_wk_arr[k].f;
 		}
 	}
-	if (key_def_is_sequential(def))
-		return tuple_compare_with_key_sequential<false, false>;
-	else
-		return tuple_compare_with_key_slowpath<false, false>;
+	return key_def_is_sequential(def) ?
+	       tuple_compare_with_key_sequential<false, false> :
+	       compare_with_key_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare_with_key */
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index e493c3b..2474d98 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -1,15 +1,31 @@
 #include "tuple_extract_key.h"
 #include "tuple.h"
 #include "fiber.h"
+#include "json/path.h"
 
 enum { MSGPACK_NULL = 0xc0 };
 
+/** True if key part i and i+1 are sequential. */
+template <bool has_json_paths>
+static inline bool
+key_def_parts_are_sequential(const struct key_def *def, int i)
+{
+	uint32_t fieldno1 = def->parts[i].fieldno + 1;
+	uint32_t fieldno2 = def->parts[i + 1].fieldno;
+	if (!has_json_paths) {
+		return fieldno1 == fieldno2;
+	} else {
+		return fieldno1 == fieldno2 && def->parts[i].path == NULL &&
+		       def->parts[i + 1].path == NULL;
+	}
+}
+
 /** True, if a key con contain two or more parts in sequence. */
 static bool
 key_def_contains_sequential_parts(const struct key_def *def)
 {
 	for (uint32_t i = 0; i < def->part_count - 1; ++i) {
-		if (def->parts[i].fieldno + 1 == def->parts[i + 1].fieldno)
+		if (key_def_parts_are_sequential<true>(def, i))
 			return true;
 	}
 	return false;
@@ -89,11 +105,13 @@ tuple_extract_key_sequential(const struct tuple *tuple, struct key_def *key_def,
  * General-purpose implementation of tuple_extract_key()
  * @copydoc tuple_extract_key()
  */
-template <bool contains_sequential_parts, bool has_optional_parts>
+template <bool contains_sequential_parts, bool has_optional_parts,
+          bool has_json_paths>
 static char *
 tuple_extract_key_slowpath(const struct tuple *tuple,
 			   struct key_def *key_def, uint32_t *key_size)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(contains_sequential_parts ==
@@ -108,9 +126,14 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 
 	/* Calculate the key size. */
 	for (uint32_t i = 0; i < part_count; ++i) {
-		const char *field =
-			tuple_field_by_part_raw(format, data, field_map,
-						&key_def->parts[i]);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, data, field_map,
+						key_def->parts[i].fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, data, field_map,
+							&key_def->parts[i]);
+		}
 		if (has_optional_parts && field == NULL) {
 			bsize += mp_sizeof_nil();
 			continue;
@@ -123,8 +146,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 			 * minimize tuple_field_raw() calls.
 			 */
 			for (; i < part_count - 1; i++) {
-				if (key_def->parts[i].fieldno + 1 !=
-				    key_def->parts[i + 1].fieldno) {
+				if (!key_def_parts_are_sequential
+				        <has_json_paths>(key_def, i)) {
 					/*
 					 * End of sequential part.
 					 */
@@ -150,9 +173,14 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 	}
 	char *key_buf = mp_encode_array(key, part_count);
 	for (uint32_t i = 0; i < part_count; ++i) {
-		const char *field =
-			tuple_field_by_part_raw(format, data, field_map,
-						&key_def->parts[i]);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, data, field_map,
+						key_def->parts[i].fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, data, field_map,
+							&key_def->parts[i]);
+		}
 		if (has_optional_parts && field == NULL) {
 			key_buf = mp_encode_nil(key_buf);
 			continue;
@@ -165,8 +193,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 			 * minimize tuple_field_raw() calls.
 			 */
 			for (; i < part_count - 1; i++) {
-				if (key_def->parts[i].fieldno + 1 !=
-				    key_def->parts[i + 1].fieldno) {
+				if (!key_def_parts_are_sequential
+				        <has_json_paths>(key_def, i)) {
 					/*
 					 * End of sequential part.
 					 */
@@ -199,11 +227,12 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
  * General-purpose version of tuple_extract_key_raw()
  * @copydoc tuple_extract_key_raw()
  */
-template <bool has_optional_parts>
+template <bool has_optional_parts, bool has_json_paths>
 static char *
 tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 			       struct key_def *key_def, uint32_t *key_size)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(mp_sizeof_nil() == 1);
@@ -231,11 +260,12 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 		uint32_t fieldno = key_def->parts[i].fieldno;
 		uint32_t null_count = 0;
 		for (; i < key_def->part_count - 1; i++) {
-			if (key_def->parts[i].fieldno + 1 !=
-			    key_def->parts[i + 1].fieldno)
+			if (!key_def_parts_are_sequential
+			        <has_json_paths>(key_def, i))
 				break;
 		}
-		uint32_t end_fieldno = key_def->parts[i].fieldno;
+		const struct key_part *part = &key_def->parts[i];
+		uint32_t end_fieldno = part->fieldno;
 
 		if (fieldno < current_fieldno) {
 			/* Rewind. */
@@ -277,6 +307,21 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 				current_fieldno++;
 			}
 		}
+		const char *field_last, *field_end_last;
+		if (has_json_paths && part->path != NULL) {
+			field_last = field;
+			field_end_last = field_end;
+			struct json_path_parser parser;
+			struct json_path_node node;
+			json_path_parser_create(&parser, part->path,
+						part->path_len);
+			/* Skip fieldno. */
+			int rc = json_path_next(&parser, &node);
+			assert(rc == 0);
+			rc = tuple_field_dig_with_parser(&parser, &field);
+			field_end = field;
+			mp_next(&field_end);
+		}
 		memcpy(key_buf, field, field_end - field);
 		key_buf += field_end - field;
 		if (has_optional_parts && null_count != 0) {
@@ -285,12 +330,27 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 		} else {
 			assert(key_buf - key <= data_end - data);
 		}
+		if (has_json_paths && part->path != NULL) {
+			field = field_last;
+			field_end = field_end_last;
+		}
 	}
 	if (key_size != NULL)
 		*key_size = (uint32_t)(key_buf - key);
 	return key;
 }
 
+static const tuple_extract_key_t extract_key_slowpath_funcs[] = {
+	tuple_extract_key_slowpath<false, false, false>,
+	tuple_extract_key_slowpath<true, false, false>,
+	tuple_extract_key_slowpath<false, true, false>,
+	tuple_extract_key_slowpath<true, true, false>,
+	tuple_extract_key_slowpath<false, false, true>,
+	tuple_extract_key_slowpath<true, false, true>,
+	tuple_extract_key_slowpath<false, true, true>,
+	tuple_extract_key_slowpath<true, true, true>
+};
+
 /**
  * Initialize tuple_extract_key() and tuple_extract_key_raw()
  */
@@ -311,32 +371,30 @@ tuple_extract_key_set(struct key_def *key_def)
 				tuple_extract_key_sequential_raw<false>;
 		}
 	} else {
-		if (key_def->has_optional_parts) {
-			assert(key_def->is_nullable);
-			if (key_def_contains_sequential_parts(key_def)) {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<true, true>;
-			} else {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<false, true>;
-			}
-		} else {
-			if (key_def_contains_sequential_parts(key_def)) {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<true, false>;
-			} else {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<false,
-								   false>;
-			}
-		}
+		int func_idx =
+			(key_def_contains_sequential_parts(key_def) ? 1 : 0) +
+			2 * (key_def->has_optional_parts ? 1 : 0) +
+			4 * (key_def->has_json_paths ? 1 : 0);
+		key_def->tuple_extract_key =
+			extract_key_slowpath_funcs[func_idx];
+		assert(!key_def->has_optional_parts || key_def->is_nullable);
 	}
 	if (key_def->has_optional_parts) {
 		assert(key_def->is_nullable);
-		key_def->tuple_extract_key_raw =
-			tuple_extract_key_slowpath_raw<true>;
+		if (key_def->has_json_paths) {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<true, true>;
+		} else {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<true, false>;
+		}
 	} else {
-		key_def->tuple_extract_key_raw =
-			tuple_extract_key_slowpath_raw<false>;
+		if (key_def->has_json_paths) {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<false, true>;
+		} else {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<false, false>;
+		}
 	}
 }
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 91677d4..fa8b938 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -30,6 +30,7 @@
  */
 #include "json/path.h"
 #include "tuple_format.h"
+#include "assoc.h"
 
 /** Global table of tuple formats */
 struct tuple_format **tuple_formats;
@@ -38,10 +39,551 @@ static intptr_t recycled_format_ids = FORMAT_ID_NIL;
 static uint32_t formats_size = 0, formats_capacity = 0;
 
 static const struct tuple_field tuple_field_default = {
-	FIELD_TYPE_ANY, TUPLE_OFFSET_SLOT_NIL, false, false,
+	FIELD_TYPE_ANY, TUPLE_OFFSET_SLOT_NIL, false, false, {{NULL, 0}}
 };
 
 /**
+ * Propagate @a field to MessagePack(field)[key].
+ * @param[in][out] field Field to propagate.
+ * @param key Key to propagate to.
+ * @param len Length of @a key.
+ * @param field_idx Field index in map.
+ *
+ * @retval  0 Success, the index was found.
+ * @retval -1 Not found.
+ */
+static inline int
+tuple_field_go_to_key(const char **field, const char *key, int len,
+		      uint32_t *field_idx)
+{
+	enum mp_type type = mp_typeof(**field);
+	if (type != MP_MAP)
+		return -1;
+	uint32_t count = mp_decode_map(field);
+	for (uint32_t idx = 0; idx < count; idx++) {
+		type = mp_typeof(**field);
+		if (type == MP_STR) {
+			uint32_t value_len;
+			const char *value = mp_decode_str(field, &value_len);
+			if (value_len == (uint)len &&
+			    memcmp(value, key, len) == 0) {
+				*field_idx = idx;
+				return 0;
+			}
+		} else {
+			/* Skip key. */
+			mp_next(field);
+		}
+		/* Skip value. */
+		mp_next(field);
+	}
+	return -1;
+}
+
+struct mh_strnptr_node_t *
+json_path_hash_get(struct mh_strnptr_t *hashtable, const char *path,
+		   uint32_t path_len, uint32_t path_hash)
+{
+	assert(hashtable != NULL);
+	struct mh_strnptr_key_t key = {path, path_len, path_hash};
+	mh_int_t rc = mh_strnptr_find(hashtable, &key, NULL);
+	if (rc == mh_end(hashtable))
+		return NULL;
+	return mh_strnptr_node(hashtable, rc);
+}
+
+/**
+ * Create a new hashtable object.
+ * @param[out] hashtable Pointer to object to create.
+ * @param records Count of records to reserve.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static struct mh_strnptr_t *
+json_path_hash_create(uint32_t records)
+{
+	struct mh_strnptr_t *ret = mh_strnptr_new();
+	if (ret == NULL) {
+		diag_set(OutOfMemory, sizeof(struct mh_strnptr_t),
+			"mh_strnptr_new", "hashtable");
+		return NULL;
+	}
+	if (records > 0 &&
+	    mh_strnptr_reserve(ret, records, NULL) != 0) {
+		mh_strnptr_delete(ret);
+		diag_set(OutOfMemory, records, "mh_strnptr_reserve",
+			 "hashtable");
+		return NULL;
+	}
+	return ret;
+}
+
+/**
+ * Delete @hashtable object.
+ * @param hashtable Pointer to object to delete.
+ */
+static void
+json_path_hash_delete(struct mh_strnptr_t *hashtable)
+{
+	assert(hashtable != NULL);
+	while (mh_size(hashtable) != 0) {
+		mh_int_t n = mh_first(hashtable);
+		mh_strnptr_del(hashtable, n, NULL);
+	}
+	mh_strnptr_delete(hashtable);
+}
+
+/**
+ * Insert a new record to hashtable.
+ * @param hashtable Storage to insert new record.
+ * @param path String with path.
+ * @param path_len Length of @path.
+ * @param field Value to store in @hashtable.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+json_path_hash_insert(struct mh_strnptr_t *hashtable, const char *path,
+		      uint32_t path_len, struct tuple_field *field)
+{
+	assert(hashtable != NULL);
+	uint32_t path_hash = mh_strn_hash(path, path_len);
+	struct mh_strnptr_node_t name_node = {path, path_len, path_hash, field};
+	mh_int_t rc = mh_strnptr_put(hashtable, &name_node, NULL, NULL);
+	if (rc == mh_end(hashtable)) {
+		diag_set(OutOfMemory, sizeof(*hashtable), "mh_strnptr_put",
+			"hashtable");
+		return -1;
+	}
+	return 0;
+}
+
+/**
+ * Construct field tree level for JSON path part.
+ *
+ * @param[in, out] tuple_field Pointer to record to start with
+ *                 would be changed to record that math
+ *                 @part lexeme.
+ * @param fieldno Number of root space field.
+ * @param part JSON path lexeme to represent in field tree.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+json_field_tree_append(struct tuple_field **field_subtree, uint32_t fieldno,
+		       struct json_path_node *part)
+{
+	enum field_type type;
+	struct tuple_field *field = *field_subtree;
+	switch (part->type) {
+	case JSON_PATH_NUM: {
+		type = FIELD_TYPE_ARRAY;
+		if (field->type != FIELD_TYPE_ANY && field->type != type)
+			goto error_type_mistmatch;
+		/* Create or resize field array if required. */
+		if (field->array == NULL || part->num > field->array_size) {
+			struct tuple_field **array =
+				realloc(field->array,
+					part->num * sizeof(array[0]));
+			if (array == NULL) {
+				diag_set(OutOfMemory,
+					 part->num * sizeof(array[0]),
+					 "realloc","array");
+				return -1;
+			}
+			memset(&array[field->array_size], 0,
+				(part->num - field->array_size) *
+				sizeof(array[0]));
+			field->array = array;
+			field->array_size = part->num;
+			field->type = type;
+		} else if (field->array[part->num - TUPLE_INDEX_BASE] != NULL) {
+			/* Record already exists. No actions required */
+			*field_subtree =
+				field->array[part->num - TUPLE_INDEX_BASE];
+			return 0;
+		}
+		break;
+	}
+	case JSON_PATH_STR: {
+		type = FIELD_TYPE_MAP;
+		if (field->type != FIELD_TYPE_ANY && field->type != type)
+			goto error_type_mistmatch;
+		if (field->map == NULL) {
+			field->map = json_path_hash_create(1);
+			if (field->map == NULL)
+				return -1;
+			field->type = type;
+		} else {
+			uint32_t str_hash = mh_strn_hash(part->str, part->len);
+			struct mh_strnptr_node_t *ht_record =
+				json_path_hash_get(field->map, part->str,
+						   part->len, str_hash);
+			if (ht_record != NULL) {
+				assert(ht_record->val != NULL);
+				*field_subtree = ht_record->val;
+				return 0;
+			}
+		}
+		break;
+	}
+	default:
+		unreachable();
+	}
+
+	/* Construct and insert a new record. */
+	struct tuple_field *new_field = malloc(sizeof(struct tuple_field));
+	if (new_field == NULL) {
+		diag_set(OutOfMemory, sizeof(struct tuple_field), "malloc",
+			"new_field");
+		return -1;
+	}
+	*new_field = tuple_field_default;
+	if (field->type == FIELD_TYPE_MAP) {
+		if (json_path_hash_insert(field->map, part->str, part->len,
+					  new_field) != 0) {
+			free(new_field);
+			return -1;
+		}
+	} else if (field->type == FIELD_TYPE_ARRAY) {
+		field->array[part->num - TUPLE_INDEX_BASE] = new_field;
+	}
+	*field_subtree = new_field;
+	return 0;
+
+error_type_mistmatch:
+	diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
+		tt_sprintf("%d", fieldno + TUPLE_INDEX_BASE),
+		field_type_strs[type], field_type_strs[field->type]);
+	return -1;
+}
+
+/**
+ * Delete @field_subtree object.
+ * @param field_subtree To delete.
+ */
+static void
+json_field_tree_delete(struct tuple_field *field_subtree)
+{
+	if (field_subtree->type == FIELD_TYPE_MAP &&
+	    field_subtree->map != NULL) {
+		mh_int_t i;
+		mh_foreach(field_subtree->map, i) {
+			struct tuple_field *field =
+				mh_strnptr_node(field_subtree->map, i)->val;
+			assert(field != NULL);
+			json_field_tree_delete(field);
+			free(field);
+		}
+		json_path_hash_delete(field_subtree->map);
+	} else if (field_subtree->type == FIELD_TYPE_ARRAY &&
+		   field_subtree->array != NULL) {
+		for (uint32_t i = 0; i < field_subtree->array_size; i++) {
+			struct tuple_field *field = field_subtree->array[i];
+			if (field == NULL)
+				continue;
+			json_field_tree_delete(field_subtree->array[i]);
+			free(field_subtree->array[i]);
+		}
+		free(field_subtree->array);
+	}
+}
+
+int
+tuple_field_bypass_and_init(const struct tuple_field *field, uint32_t idx,
+			    const char *tuple, const char **offset,
+			    uint32_t *field_map)
+{
+	assert(offset != NULL);
+	const char *mp_data = *offset;
+	const char *valid_type_str = NULL;
+	const char *err = NULL;
+	enum mp_type type = mp_typeof(**offset);
+	if (field->type == FIELD_TYPE_MAP) {
+		if (type != MP_MAP) {
+			valid_type_str = mp_type_strs[MP_MAP];
+			goto error_type_mistmatch;
+		}
+		const char *max_offset = *offset;
+		uint32_t max_idx = 0;
+		uint32_t count = mp_decode_map(&max_offset);
+		mh_int_t i;
+		mh_foreach(field->map, i) {
+			struct mh_strnptr_node_t *ht_record =
+				mh_strnptr_node(field->map, i);
+			struct tuple_field *leaf = ht_record->val;
+			assert(leaf != NULL);
+
+			const char *raw = *offset;
+			uint32_t map_idx = 0;
+			int rc = tuple_field_go_to_key(&raw, ht_record->str,
+						       (int)ht_record->len,
+						       &map_idx);
+			if (rc != 0 && !leaf->is_nullable) {
+				err = tt_sprintf("map doesn't contain key "
+						 "'%.*s' defined in index",
+						 ht_record->len,ht_record->str);
+				goto error_invalid_document;
+			}
+			if (rc != 0) {
+				if (field_map != NULL &&
+				    leaf->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+				    field_map[leaf->offset_slot] = 0;
+				continue;
+			}
+			if (tuple_field_bypass_and_init(leaf, idx, tuple, &raw,
+							field_map) != 0)
+				return -1;
+			max_idx = MAX(max_idx, map_idx + 1);
+			max_offset = MAX(max_offset, raw);
+		}
+		*offset = max_offset;
+		while (count-- > max_idx) {
+			mp_next(offset);
+			mp_next(offset);
+		}
+		return 0;
+	} else if (field->type == FIELD_TYPE_ARRAY) {
+		if (type != MP_ARRAY) {
+			valid_type_str = mp_type_strs[MP_ARRAY];
+			goto error_type_mistmatch;
+		}
+		uint32_t count = mp_decode_array(offset);
+		for (uint32_t i = count; i < field->array_size; i++) {
+			/*
+			 * Index fields out of document array
+			 * must be nullable.
+			 */
+			struct tuple_field *leaf = field->array[i];
+			if (leaf == NULL)
+				continue;
+			if (leaf->is_nullable) {
+				if (field_map != NULL &&
+				    leaf->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+					field_map[leaf->offset_slot] = 0;
+				continue;
+			}
+			err = tt_sprintf("array size %d is less than size of %d "
+					 "defined in index", i, i + 1);
+			goto error_invalid_document;
+		}
+		uint32_t fields = MIN(field->array_size, count);
+		for (uint32_t i = 0; i < fields; i++) {
+			if (field->array[i] == NULL) {
+				mp_next(offset);
+				continue;
+			}
+			if (tuple_field_bypass_and_init(field->array[i], idx,
+							tuple,
+							offset, field_map) != 0)
+				return -1;
+		}
+		while (count-- > fields)
+			mp_next(offset);
+		return 0;
+	}
+	/* Tree leaf field */
+	if (key_mp_type_validate(field->type, type, ER_KEY_PART_TYPE, idx,
+				 field->is_nullable) != 0) {
+		valid_type_str = field_type_strs[field->type];
+		goto error_type_mistmatch;
+	}
+	assert(offset != NULL);
+	if (field_map != NULL &&
+	    field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+		field_map[field->offset_slot] = (uint32_t) (*offset - tuple);
+	mp_next(offset);
+	return 0;
+
+error_type_mistmatch:
+	err = tt_sprintf("type mismatch: have %s, expected %s",
+			 mp_type_strs[type], valid_type_str);
+error_invalid_document:
+	assert(err != NULL);
+	char *data_buff = tt_static_buf();
+	mp_snprint(data_buff, TT_STATIC_BUF_LEN, mp_data);
+	const char *err_msg =
+		tt_sprintf("invalid field %d document content '%s': %s",
+			   idx + TUPLE_INDEX_BASE, data_buff, err);
+	diag_set(ClientError, ER_DATA_STRUCTURE_MISMATCH, err_msg);
+	return -1;
+}
+
+/**
+ * Add new JSON @path to @format.
+ * @param format Tuple format to modify.
+ * @param path String to add.
+ * @param path_len Length of @path.
+ * @param path_hash Hash of @path.
+ * @param type Type of field by @path.
+ * @param is_nullable Nullability of field by @path.
+ * @param strings Area to store unique JSON paths (optional).
+ * @param[out] leaf Pointer to leaf field.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+tuple_format_add_json_path(struct tuple_format *format, const char *path,
+			   uint32_t path_len, uint32_t path_hash,
+			   enum field_type type, bool is_nullable,
+			   char **strings, struct tuple_field **leaf)
+{
+	assert(format->path_hash != NULL);
+	/*
+	 * Get root field by index.
+	 * Path is specified in canonical form: [i]...
+	 */
+	struct json_path_parser parser;
+	struct json_path_node node;
+	json_path_parser_create(&parser, path, path_len);
+	int rc = json_path_next(&parser, &node);
+	assert(rc == 0 && node.type == JSON_PATH_NUM);
+	assert(node.num - TUPLE_INDEX_BASE < format->field_count);
+
+	/* Test if path is already registered. */
+	struct mh_strnptr_node_t *ht_record =
+		json_path_hash_get(format->path_hash, path, path_len, path_hash);
+	assert(ht_record != NULL);
+	struct tuple_field *field = ht_record->val;
+	if (unlikely(field != NULL)) {
+		/* Path has been already registered. */
+		if (field->is_nullable != is_nullable)
+			field->is_nullable = false;
+		if (field_type1_contains_type2(field->type, type)) {
+			field->type = type;
+		} else if (!field_type1_contains_type2(type, field->type)) {
+			const char *err =
+			    tt_sprintf("JSON path '%.*s' has been already "
+				       "constructed for '%s' leaf record",
+				       path_len, path,
+				       field_type_strs[field->type]);
+			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+				 node.num, err);
+			return -1;
+		}
+		*leaf = field;
+		return 0;
+	} else if (strings != NULL) {
+		/*
+		 * Hashtable should hold memory related to format
+		 * chunk allocation.
+		 */
+		memcpy(*strings, path, path_len);
+		(*strings)[path_len] = '\0';
+		ht_record->str = *strings;
+		*strings += path_len + 1;
+	}
+
+	/*
+	 * We have to re-init parser with path string located in
+	 * format chunk.
+	 */
+	json_path_parser_create(&parser, ht_record->str + parser.offset,
+				path_len - parser.offset);
+	/* Build data path tree. */
+	uint32_t root_fieldno = node.num - TUPLE_INDEX_BASE;
+	field = &format->fields[root_fieldno];
+	while ((rc = json_path_next(&parser, &node)) == 0 &&
+		node.type != JSON_PATH_END) {
+		if (json_field_tree_append(&field, root_fieldno, &node) != 0)
+			return -1;
+	}
+	assert(rc == 0 && node.type == JSON_PATH_END);
+
+	/* Leaf record is a new object as JSON path unique. */
+	field->type = type;
+	field->is_nullable = is_nullable;
+	*leaf = field;
+	ht_record->val = field;
+	return 0;
+}
+
+/**
+ * Add a new key_part to format and initialize format tuple_field
+ * representation.
+ * @param format Format to initialize.
+ * @param fields Fields definition if any.
+ * @param fields_count Count of @fields.
+ * @param part An index part to append.
+ * @param is_sequential Does this part sequential.
+ * @param data Memory to store path strings.
+ * @param current_slot Pointer to last offset slot.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+tuple_format_add_key_part(struct tuple_format *format,
+			  const struct field_def *fields, uint32_t field_count,
+			  const struct key_part *part, bool is_sequential,
+			  char **data, int *current_slot)
+{
+	assert(part->fieldno < format->field_count);
+	struct tuple_field *field = &format->fields[part->fieldno];
+	if (part->path != NULL) {
+		field->is_key_part = true;
+		assert(!is_sequential);
+		struct tuple_field *leaf = NULL;
+		if (tuple_format_add_json_path(format, part->path,
+					       part->path_len, part->path_hash,
+					       part->type, part->is_nullable,
+					       data, &leaf) != 0)
+			return -1;
+		assert(leaf != NULL);
+		if (leaf->offset_slot == TUPLE_OFFSET_SLOT_NIL) {
+			*current_slot = *current_slot - 1;
+			leaf->offset_slot = *current_slot;
+		}
+		return 0;
+	}
+	if (part->fieldno >= field_count) {
+		field->is_nullable = part->is_nullable;
+	} else if (field->is_nullable != part->is_nullable) {
+		/*
+		 * In case of mismatch set the most
+		 * strict option for is_nullable.
+		 */
+		field->is_nullable = false;
+	}
+	/*
+	 * Check that there are no conflicts
+	 * between index part types and space
+	 * fields. If a part type is compatible
+	 * with field's one, then the part type is
+	 * more strict and the part type must be
+	 * used in tuple_format.
+	 */
+	if (field_type1_contains_type2(field->type, part->type)) {
+		field->type = part->type;
+	} else if (!field_type1_contains_type2(part->type, field->type)) {
+		int fieldno = part->fieldno + TUPLE_INDEX_BASE;
+		const char *name = part->fieldno >= field_count ?
+				   tt_sprintf("%d", fieldno) :
+				   tt_sprintf("'%s'",
+					      fields[part->fieldno].name);
+		int errcode = !field->is_key_part ?
+			      ER_FORMAT_MISMATCH_INDEX_PART :
+			      ER_INDEX_PART_TYPE_MISMATCH;
+		diag_set(ClientError, errcode, name,
+			 field_type_strs[field->type],
+			 field_type_strs[part->type]);
+		return -1;
+	}
+	field->is_key_part = true;
+	/*
+	 * In the tuple, store only offsets necessary
+	 * to access fields of non-sequential keys.
+	 * First field is always simply accessible,
+	 * so we don't store an offset for it.
+	 */
+	if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL && !is_sequential &&
+	    part->fieldno > 0) {
+		*current_slot = *current_slot - 1;
+		field->offset_slot = *current_slot;
+	}
+	return 0;
+}
+
+/**
  * Extract all available type info from keys and field
  * definitions.
  */
@@ -63,12 +605,18 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 		format->fields[i].type = fields[i].type;
 		format->fields[i].offset_slot = TUPLE_OFFSET_SLOT_NIL;
 		format->fields[i].is_nullable = fields[i].is_nullable;
+		/* Don't need to init format->fields[i].map. */
+		format->fields[i].childs = NULL;
+		format->fields[i].array_size = 0;
 	}
 	/* Initialize remaining fields */
 	for (uint32_t i = field_count; i < format->field_count; i++)
 		format->fields[i] = tuple_field_default;
 
 	int current_slot = 0;
+	/* Memory allocated for JSON paths if any. */
+	char *data = (char *)format + sizeof(struct tuple_format) +
+		     format->field_count * sizeof(struct tuple_field);
 
 	/* extract field type info */
 	for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
@@ -76,65 +624,12 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 		bool is_sequential = key_def_is_sequential(key_def);
 		const struct key_part *part = key_def->parts;
 		const struct key_part *parts_end = part + key_def->part_count;
-
 		for (; part < parts_end; part++) {
-			assert(part->fieldno < format->field_count);
-			struct tuple_field *field =
-				&format->fields[part->fieldno];
-			if (part->fieldno >= field_count) {
-				field->is_nullable = part->is_nullable;
-			} else if (field->is_nullable != part->is_nullable) {
-				/*
-				 * In case of mismatch set the most
-				 * strict option for is_nullable.
-				 */
-				field->is_nullable = false;
-			}
-
-			/*
-			 * Check that there are no conflicts
-			 * between index part types and space
-			 * fields. If a part type is compatible
-			 * with field's one, then the part type is
-			 * more strict and the part type must be
-			 * used in tuple_format.
-			 */
-			if (field_type1_contains_type2(field->type,
-						       part->type)) {
-				field->type = part->type;
-			} else if (! field_type1_contains_type2(part->type,
-								field->type)) {
-				const char *name;
-				int fieldno = part->fieldno + TUPLE_INDEX_BASE;
-				if (part->fieldno >= field_count) {
-					name = tt_sprintf("%d", fieldno);
-				} else {
-					const struct field_def *def =
-						&fields[part->fieldno];
-					name = tt_sprintf("'%s'", def->name);
-				}
-				int errcode;
-				if (! field->is_key_part)
-					errcode = ER_FORMAT_MISMATCH_INDEX_PART;
-				else
-					errcode = ER_INDEX_PART_TYPE_MISMATCH;
-				diag_set(ClientError, errcode, name,
-					 field_type_strs[field->type],
-					 field_type_strs[part->type]);
+			if (tuple_format_add_key_part(format, fields,
+						      field_count, part,
+						      is_sequential, &data,
+						      &current_slot) != 0)
 				return -1;
-			}
-			field->is_key_part = true;
-			/*
-			 * In the tuple, store only offsets necessary
-			 * to access fields of non-sequential keys.
-			 * First field is always simply accessible,
-			 * so we don't store an offset for it.
-			 */
-			if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL &&
-			    is_sequential == false && part->fieldno > 0) {
-
-				field->offset_slot = --current_slot;
-			}
 		}
 	}
 
@@ -201,32 +696,58 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
 		   uint32_t space_field_count, struct tuple_dictionary *dict)
 {
 	uint32_t index_field_count = 0;
+	/* JSON path hashtable. */
+	struct mh_strnptr_t *path_hash = json_path_hash_create(0);
+	if (path_hash == NULL)
+		return NULL;
 	/* find max max field no */
 	for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
 		const struct key_def *key_def = keys[key_no];
 		const struct key_part *part = key_def->parts;
 		const struct key_part *pend = part + key_def->part_count;
 		for (; part < pend; part++) {
+			if (part->path != NULL &&
+			    json_path_hash_insert(path_hash, part->path,
+						  part->path_len, NULL) != 0)
+				goto error;
 			index_field_count = MAX(index_field_count,
 						part->fieldno + 1);
 		}
 	}
+	size_t extra_size = 0;
+	if (mh_size(path_hash) == 0) {
+		/* Hashtable is useless. */
+		json_path_hash_delete(path_hash);
+		path_hash = NULL;
+	} else {
+		/*
+		 * Calculate unique JSON paths count.
+		 * Path data would be copied later on
+		 * tuple_format_create routine.
+		 */
+		mh_int_t i;
+		mh_foreach(path_hash, i) {
+			struct mh_strnptr_node_t *node =
+				mh_strnptr_node(path_hash, i);
+			extra_size += node->len + 1;
+		}
+	}
 	uint32_t field_count = MAX(space_field_count, index_field_count);
 	uint32_t total = sizeof(struct tuple_format) +
-			 field_count * sizeof(struct tuple_field);
+			 field_count * sizeof(struct tuple_field) + extra_size;
 
 	struct tuple_format *format = (struct tuple_format *) malloc(total);
 	if (format == NULL) {
 		diag_set(OutOfMemory, sizeof(struct tuple_format), "malloc",
 			 "tuple format");
-		return NULL;
+		goto error;
 	}
 	if (dict == NULL) {
 		assert(space_field_count == 0);
 		format->dict = tuple_dictionary_new(NULL, 0);
 		if (format->dict == NULL) {
 			free(format);
-			return NULL;
+			goto error;
 		}
 	} else {
 		format->dict = dict;
@@ -238,13 +759,21 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
 	format->index_field_count = index_field_count;
 	format->exact_field_count = 0;
 	format->min_field_count = 0;
+	format->path_hash = path_hash;
 	return format;
+error:
+	json_path_hash_delete(path_hash);
+	return NULL;
 }
 
 /** Free tuple format resources, doesn't unregister. */
 static inline void
 tuple_format_destroy(struct tuple_format *format)
 {
+	for (uint32_t i = 0; i < format->field_count; i++)
+		json_field_tree_delete(&format->fields[i]);
+	if (format->path_hash != NULL)
+		json_path_hash_delete(format->path_hash);
 	tuple_dictionary_unref(format->dict);
 }
 
@@ -329,21 +858,61 @@ tuple_format_dup(struct tuple_format *src)
 {
 	uint32_t total = sizeof(struct tuple_format) +
 			 src->field_count * sizeof(struct tuple_field);
+	if (src->path_hash != NULL) {
+		mh_int_t i;
+		mh_foreach(src->path_hash, i)
+			total += mh_strnptr_node(src->path_hash, i)->len + 1;
+	}
 	struct tuple_format *format = (struct tuple_format *) malloc(total);
 	if (format == NULL) {
 		diag_set(OutOfMemory, total, "malloc", "tuple format");
 		return NULL;
 	}
 	memcpy(format, src, total);
+
+	/* Fill with NULLs for normal destruction on error. */
+	format->path_hash = NULL;
+	for (uint32_t i = 0; i < format->field_count; i++) {
+		format->fields[i].childs = NULL;
+		format->fields[i].array_size = 0;
+	}
+	if (src->path_hash != NULL) {
+		mh_int_t i;
+		format->path_hash =
+			json_path_hash_create(mh_size(src->path_hash));
+		if (format->path_hash == NULL)
+			goto error;
+		mh_foreach(src->path_hash, i) {
+			struct mh_strnptr_node_t *node =
+				mh_strnptr_node(src->path_hash, i);
+			/* Path data has been already copied. */
+			char *path = (char *)format + (node->str - (char *)src);
+			if (json_path_hash_insert(format->path_hash, path,
+						  node->len, NULL) != 0)
+				goto error;
+			/* Store source leaf field offset_slot.  */
+			struct tuple_field *leaf = node->val;
+			int32_t offset_slot = leaf->offset_slot;
+			uint32_t path_hash = mh_strn_hash(path, node->len);
+			if (tuple_format_add_json_path(format, path, node->len,
+						       path_hash, leaf->type,
+						       leaf->is_nullable, NULL,
+						       &leaf) != 0)
+				goto error;
+			/* Store offset_slot in a new leaf record. */
+			assert(leaf != NULL);
+			leaf->offset_slot = offset_slot;
+		}
+	}
 	tuple_dictionary_ref(format->dict);
 	format->id = FORMAT_ID_NIL;
 	format->refs = 0;
-	if (tuple_format_register(format) != 0) {
-		tuple_format_destroy(format);
-		free(format);
-		return NULL;
-	}
-	return format;
+	if (tuple_format_register(format) == 0)
+		return format;
+error:
+	tuple_format_destroy(format);
+	free(format);
+	return NULL;
 }
 
 /** @sa declaration for details. */
@@ -372,18 +941,10 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
 		return -1;
 	}
 
-	/* first field is simply accessible, so we do not store offset to it */
-	enum mp_type mp_type = mp_typeof(*pos);
+	uint32_t i = 0;
+	enum mp_type mp_type;
 	const struct tuple_field *field = &format->fields[0];
-	if (key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
-				 TUPLE_INDEX_BASE, field->is_nullable))
-		return -1;
-	mp_next(&pos);
-	/* other fields...*/
-	++field;
-	uint32_t i = 1;
-	uint32_t defined_field_count = MIN(field_count, format->field_count);
-	if (field_count < format->index_field_count) {
+	if (field_count < format->index_field_count || field->childs != NULL) {
 		/*
 		 * Nullify field map to be able to detect by 0,
 		 * which key fields are absent in tuple_field().
@@ -391,6 +952,20 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
 		memset((char *)field_map - format->field_map_size, 0,
 		       format->field_map_size);
 	}
+	if (field->childs == NULL) {
+		/*
+		 * First field is simply accessible, do not store
+		 * offset to it.
+		 */
+		mp_type = mp_typeof(*pos);
+		if (key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
+					 TUPLE_INDEX_BASE, field->is_nullable))
+			return -1;
+		mp_next(&pos);
+		++field;
+		++i;
+	}
+	uint32_t defined_field_count = MIN(field_count, format->field_count);
 	for (; i < defined_field_count; ++i, ++field) {
 		mp_type = mp_typeof(*pos);
 		if (key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
@@ -400,8 +975,12 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
 		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
 			field_map[field->offset_slot] =
 				(uint32_t) (pos - tuple);
-		}
-		mp_next(&pos);
+		} else if (field->childs != NULL &&
+			   tuple_field_bypass_and_init(field, i, tuple, &pos,
+						       field_map) != 0)
+			return -1;
+		if (field->childs == NULL)
+			mp_next(&pos);
 	}
 	return 0;
 }
@@ -507,55 +1086,99 @@ tuple_field_go_to_index(const char **field, uint64_t index)
 	return -1;
 }
 
-/**
- * Propagate @a field to MessagePack(field)[key].
- * @param[in][out] field Field to propagate.
- * @param key Key to propagate to.
- * @param len Length of @a key.
- *
- * @retval  0 Success, the index was found.
- * @retval -1 Not found.
- */
-static inline int
-tuple_field_go_to_key(const char **field, const char *key, int len)
+const char *
+tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
+			const uint32_t *field_map, struct key_part *part)
 {
-	enum mp_type type = mp_typeof(**field);
-	if (type != MP_MAP)
-		return -1;
-	uint64_t count = mp_decode_map(field);
-	for (; count > 0; --count) {
-		type = mp_typeof(**field);
-		if (type == MP_STR) {
-			uint32_t value_len;
-			const char *value = mp_decode_str(field, &value_len);
-			if (value_len == (uint)len &&
-			    memcmp(value, key, len) == 0)
-				return 0;
-		} else {
-			/* Skip key. */
-			mp_next(field);
-		}
-		/* Skip value. */
-		mp_next(field);
+	if (likely(part->path == NULL))
+		return tuple_field_raw(format, data, field_map, part->fieldno);
+
+	struct mh_strnptr_node_t *ht_record = NULL;
+	int32_t offset_slot;
+	if (format->path_hash != NULL &&
+	    (ht_record = json_path_hash_get(format->path_hash, part->path,
+					    part->path_len,
+					    part->path_hash)) != NULL) {
+		struct tuple_field *field = ht_record->val;
+		assert(field != NULL);
+		offset_slot = field->offset_slot;
+	} else {
+		/*
+		 * Legacy tuple having no field map for
+		 * JSON index.
+		 */
+		uint32_t path_hash =
+			field_name_hash(part->path, part->path_len);
+		const char *raw = NULL;
+		if (tuple_field_raw_by_path(format, data, field_map,
+					    part->path, part->path_len,
+					    path_hash, &raw) != 0)
+			raw = NULL;
+		return raw;
 	}
-	return -1;
+	assert(offset_slot < 0);
+	assert(-offset_slot * sizeof(uint32_t) <= format->field_map_size);
+	if (unlikely(field_map[offset_slot] == 0))
+		return NULL;
+	return data + field_map[offset_slot];
 }
 
-const char *
-tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
-			const uint32_t *field_map, struct key_part *part)
+int
+tuple_field_dig_with_parser(struct json_path_parser *parser, const char **field)
 {
-	return tuple_field_raw(format, data, field_map, part->fieldno);
+	int rc;
+	struct json_path_node node;
+	while ((rc = json_path_next(parser, &node)) == 0) {
+		uint32_t dummy;
+		switch(node.type) {
+			case JSON_PATH_NUM:
+				rc = tuple_field_go_to_index(field, node.num);
+				break;
+			case JSON_PATH_STR:
+				rc = tuple_field_go_to_key(field, node.str,
+							   node.len, &dummy);
+				break;
+			default:
+				assert(node.type == JSON_PATH_END);
+				return 0;
+		}
+		if (rc != 0) {
+			*field = NULL;
+			return 0;
+		}
+	}
+	return rc;
 }
 
 int
-tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
+tuple_field_raw_by_path(const struct tuple_format *format, const char *tuple,
                         const uint32_t *field_map, const char *path,
                         uint32_t path_len, uint32_t path_hash,
                         const char **field)
 {
 	assert(path_len > 0);
 	uint32_t fieldno;
+	if (format->path_hash != NULL) {
+		/*
+		 * The path hash for format->path_hash hashtable
+		 * may may be different from path_hash specified
+		 * as function argument.
+		 */
+		struct mh_strnptr_node_t *ht_record =
+			json_path_hash_get(format->path_hash, path, path_len,
+					   mh_strn_hash(path, path_len));
+		if (ht_record != NULL) {
+			struct tuple_field *leaf = ht_record->val;
+			assert(leaf != NULL);
+			int32_t offset_slot = leaf->offset_slot;
+			assert(offset_slot != TUPLE_OFFSET_SLOT_NIL);
+			if (likely(field_map[offset_slot] != 0))
+				*field = tuple + field_map[offset_slot];
+			else
+				*field = NULL;
+			return 0;
+		}
+	}
 	/*
 	 * It is possible, that a field has a name as
 	 * well-formatted JSON. For example 'a.b.c.d' or '[1]' can
@@ -611,23 +1234,9 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 		*field = NULL;
 		return 0;
 	}
-	while ((rc = json_path_next(&parser, &node)) == 0) {
-		switch(node.type) {
-		case JSON_PATH_NUM:
-			rc = tuple_field_go_to_index(field, node.num);
-			break;
-		case JSON_PATH_STR:
-			rc = tuple_field_go_to_key(field, node.str, node.len);
-			break;
-		default:
-			assert(node.type == JSON_PATH_END);
-			return 0;
-		}
-		if (rc != 0) {
-			*field = NULL;
-			return 0;
-		}
-	}
+	rc = tuple_field_dig_with_parser(&parser, field);
+	if (rc == 0)
+		return 0;
 error:
 	assert(rc > 0);
 	diag_set(ClientError, ER_ILLEGAL_PARAMS,
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 0ffdf7d..df500db 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -63,6 +63,8 @@ enum { TUPLE_OFFSET_SLOT_NIL = INT32_MAX };
 
 struct tuple;
 struct tuple_format;
+struct json_path_parser;
+struct mh_strnptr_t;
 
 /** Engine-specific tuple format methods. */
 struct tuple_format_vtab {
@@ -108,6 +110,21 @@ struct tuple_field {
 	bool is_key_part;
 	/** True, if a field can store NULL. */
 	bool is_nullable;
+	/** Tree child records. Must at the end of struct */
+	union {
+		/** Array of fields. */
+		struct {
+			struct tuple_field **array;
+			uint32_t array_size;
+		};
+		/** Hashtable: path -> tuple_field. */
+		struct mh_strnptr_t *map;
+		/**
+		 * Auxiliary pointer to test if field has
+		 * JSON path subtree.
+		 */
+		void *childs;
+	};
 };
 
 /**
@@ -161,6 +178,8 @@ struct tuple_format {
 	 * Shared names storage used by all formats of a space.
 	 */
 	struct tuple_dictionary *dict;
+	/** JSON path hash table. */
+	struct mh_strnptr_t *path_hash;
 	/* Formats of the fields */
 	struct tuple_field fields[0];
 };
@@ -388,7 +407,7 @@ tuple_field_raw(const struct tuple_format *format, const char *tuple,
  * @retval     NULL No field with @a name.
  */
 static inline const char *
-tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
+tuple_field_raw_by_name(const struct tuple_format *format, const char *tuple,
 			const uint32_t *field_map, const char *name,
 			uint32_t name_len, uint32_t name_hash)
 {
@@ -413,11 +432,51 @@ tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
  * @retval -1 Error in JSON path.
  */
 int
-tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
+tuple_field_raw_by_path(const struct tuple_format *format, const char *tuple,
                         const uint32_t *field_map, const char *path,
                         uint32_t path_len, uint32_t path_hash,
                         const char **field);
 
+/**
+ * Retrieve document data @field with initialized @parser.
+ * @param parser JSON parser.
+ * @param[in, out] field Tuple field to lookup.
+ * @retval 0 On success.
+ * @retval > 0 On error in path been used to initialize @parser.
+ */
+int
+tuple_field_dig_with_parser(struct json_path_parser *parser,
+			    const char **field);
+
+/**
+ * Get @hashtable record by key @path, @path_len.
+ * @param hashtable Storage to lookup.
+ * @param path Path string.
+ * @param path_len Length of @path.
+ * @param path_hash Hash of @path.
+ * @retval NULL On nothing found.
+ * @retval not NULL Leaf field pointer for registered path.
+ */
+struct mh_strnptr_node_t *
+json_path_hash_get(struct mh_strnptr_t *hashtable, const char *path,
+		   uint32_t path_len, uint32_t path_hash);
+
+/**
+ * Observe JSON path tree in @field comparing with @tuple
+ * structure. Initialize field map if specified.
+ * @param field Field to use on initialization.
+ * @param idx Root field index to emmit correct error.
+ * @param tuple Source raw data.
+ * @param offset Document field offset to process.
+ * @param field_map Field map to initialize (optional).
+ * @retval 0 On success.
+ * @retval -1 On error.
+ */
+int
+tuple_field_bypass_and_init(const struct tuple_field *field, uint32_t idx,
+			    const char *tuple, const char **offset,
+			    uint32_t *field_map);
+
 #if defined(__cplusplus)
 } /* extern "C" */
 #endif /* defined(__cplusplus) */
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index b394804..8ede290 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -213,7 +213,7 @@ static const hasher_signature hash_arr[] = {
 
 #undef HASHER
 
-template <bool has_optional_parts>
+template <bool has_optional_parts, bool has_json_paths>
 uint32_t
 tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def);
 
@@ -222,7 +222,7 @@ key_hash_slowpath(const char *key, struct key_def *key_def);
 
 void
 tuple_hash_func_set(struct key_def *key_def) {
-	if (key_def->is_nullable)
+	if (key_def->is_nullable || key_def->has_json_paths)
 		goto slowpath;
 	/*
 	 * Check that key_def defines sequential a key without holes
@@ -256,10 +256,17 @@ tuple_hash_func_set(struct key_def *key_def) {
 	}
 
 slowpath:
-	if (key_def->has_optional_parts)
-		key_def->tuple_hash = tuple_hash_slowpath<true>;
-	else
-		key_def->tuple_hash = tuple_hash_slowpath<false>;
+	if (key_def->has_optional_parts) {
+		if (key_def->has_json_paths)
+			key_def->tuple_hash = tuple_hash_slowpath<true, true>;
+		else
+			key_def->tuple_hash = tuple_hash_slowpath<true, false>;
+	} else {
+		if (key_def->has_json_paths)
+			key_def->tuple_hash = tuple_hash_slowpath<false, true>;
+		else
+			key_def->tuple_hash = tuple_hash_slowpath<false, false>;
+	}
 	key_def->key_hash = key_hash_slowpath;
 }
 
@@ -319,10 +326,11 @@ tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry, const struct tuple *tuple,
 	return tuple_hash_field(ph1, pcarry, &field, part->coll);
 }
 
-template <bool has_optional_parts>
+template <bool has_optional_parts, bool has_json_paths>
 uint32_t
 tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	uint32_t h = HASH_SEED;
 	uint32_t carry = 0;
@@ -331,9 +339,13 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 	struct tuple_format *format = tuple_format(tuple);
 	const char *tuple_raw = tuple_data(tuple);
 	const uint32_t *field_map = tuple_field_map(tuple);
-	const char *field =
-		tuple_field_by_part_raw(format, tuple_raw, field_map,
-					key_def->parts);
+	const char *field;
+	if (!has_json_paths) {
+		field = tuple_field(tuple, prev_fieldno);
+	} else {
+		field = tuple_field_by_part_raw(format, tuple_raw, field_map,
+						key_def->parts);
+	}
 	const char *end = (char *)tuple + tuple_size(tuple);
 	if (has_optional_parts && field == NULL) {
 		total_size += tuple_hash_null(&h, &carry);
@@ -347,9 +359,18 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 		 * need of tuple_field
 		 */
 		if (prev_fieldno + 1 != key_def->parts[part_id].fieldno) {
-			struct key_part *part = &key_def->parts[part_id];
-			field = tuple_field_by_part_raw(format, tuple_raw,
-							field_map, part);
+			if (!has_json_paths) {
+				field = tuple_field(tuple,
+						    key_def->parts[part_id].
+						    fieldno);
+			} else {
+				struct key_part *part =
+					&key_def->parts[part_id];
+				field = tuple_field_by_part_raw(format,
+								tuple_raw,
+								field_map,
+								part);
+			}
 		}
 		if (has_optional_parts && (field == NULL || field >= end)) {
 			total_size += tuple_hash_null(&h, &carry);
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index b590487..073466f 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -982,6 +982,11 @@ vinyl_index_def_change_requires_rebuild(struct index *index,
 			return true;
 		if (!field_type1_contains_type2(new_part->type, old_part->type))
 			return true;
+		if (old_part->path_len != new_part->path_len)
+			return true;
+		if (memcmp(old_part->path, new_part->path,
+			   old_part->path_len) != 0)
+			return true;
 	}
 	return false;
 }
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index fc8ede5..f396705 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -711,7 +711,8 @@ vy_log_record_dup(struct region *pool, const struct vy_log_record *src)
 				 "struct key_part_def");
 			goto err;
 		}
-		key_def_dump_parts(src->key_def, dst->key_parts);
+		if (key_def_dump_parts(pool, src->key_def, dst->key_parts) != 0)
+			goto err;
 		dst->key_part_count = src->key_def->part_count;
 		dst->key_def = NULL;
 	}
diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
index de5ad31..b55933f 100644
--- a/src/box/vy_lsm.c
+++ b/src/box/vy_lsm.c
@@ -36,6 +36,7 @@
 #include <sys/stat.h>
 #include <sys/types.h>
 #include <small/mempool.h>
+#include <assoc.h>
 
 #include "diag.h"
 #include "errcode.h"
diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
index 7b704b8..9d5e220 100644
--- a/src/box/vy_point_lookup.c
+++ b/src/box/vy_point_lookup.c
@@ -196,8 +196,6 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx,
 		const struct vy_read_view **rv,
 		struct tuple *key, struct tuple **ret)
 {
-	assert(tuple_field_count(key) >= lsm->cmp_def->part_count);
-
 	*ret = NULL;
 	double start_time = ev_monotonic_now(loop());
 	int rc = 0;
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 8018dee..e06305e 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -44,6 +44,7 @@
 #include "tuple_format.h"
 #include "xrow.h"
 #include "fiber.h"
+#include "assoc.h"
 
 /**
  * Statement metadata keys.
@@ -330,6 +331,71 @@ vy_stmt_replace_from_upsert(const struct tuple *upsert)
 	return replace;
 }
 
+static void
+vy_stmt_msgpack_build(struct tuple_field *field, char *tuple,
+		      uint32_t *field_map, char **offset, bool write_data,
+		      struct mh_i64ptr_t *fields_iov_ht)
+{
+	if (field->type == FIELD_TYPE_ARRAY) {
+		if (write_data)
+			*offset = mp_encode_array(*offset, field->array_size);
+		else
+			*offset += mp_sizeof_array(field->array_size);
+		for (uint32_t i = 0; i < field->array_size; i++) {
+			if (field->array[i] == NULL) {
+				if (write_data)
+					*offset = mp_encode_nil(*offset);
+				else
+					*offset += mp_sizeof_nil();
+				continue;
+			}
+			vy_stmt_msgpack_build(field->array[i], tuple, field_map,
+					      offset, write_data,
+					      fields_iov_ht);
+		}
+		return;
+	} else if (field->type == FIELD_TYPE_MAP) {
+		if (write_data)
+			*offset = mp_encode_map(*offset, mh_size(field->map));
+		else
+			*offset += mp_sizeof_map(mh_size(field->map));
+		mh_int_t i;
+		mh_foreach(field->map, i) {
+			struct mh_strnptr_node_t *node =
+				mh_strnptr_node(field->map, i);
+			assert(node);
+			if (write_data) {
+				*offset = mp_encode_str(*offset, node->str,
+							node->len);
+			} else {
+				*offset += mp_sizeof_str(node->len);
+			}
+			vy_stmt_msgpack_build(node->val, tuple, field_map,
+					      offset, write_data,
+					      fields_iov_ht);
+		}
+		return;
+	}
+
+	mh_int_t k = mh_i64ptr_find(fields_iov_ht, (uint64_t)field, NULL);
+	struct iovec *iov = k != mh_end(fields_iov_ht) ?
+			    mh_i64ptr_node(fields_iov_ht, k)->val : NULL;
+	if (iov == NULL) {
+		if (write_data)
+			*offset = mp_encode_nil(*offset);
+		else
+			*offset += mp_sizeof_nil();
+	} else {
+		if (write_data) {
+			uint32_t data_offset = *offset - tuple;
+			memcpy(*offset, iov->iov_base, iov->iov_len);
+			if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+				field_map[field->offset_slot] = data_offset;
+		}
+		*offset += iov->iov_len;
+	}
+}
+
 static struct tuple *
 vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
 			       const struct key_def *cmp_def,
@@ -338,51 +404,79 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
 	/* UPSERT can't be surrogate. */
 	assert(type != IPROTO_UPSERT);
 	struct region *region = &fiber()->gc;
+	struct tuple *stmt = NULL;
 
 	uint32_t field_count = format->index_field_count;
-	struct iovec *iov = region_alloc(region, sizeof(*iov) * field_count);
+	uint32_t part_count = mp_decode_array(&key);
+	assert(part_count == cmp_def->part_count);
+	struct iovec *iov = region_alloc(region, sizeof(*iov) * part_count);
 	if (iov == NULL) {
-		diag_set(OutOfMemory, sizeof(*iov) * field_count,
-			 "region", "iov for surrogate key");
+		diag_set(OutOfMemory, sizeof(*iov) * part_count, "region",
+			"iov for surrogate key");
 		return NULL;
 	}
-	memset(iov, 0, sizeof(*iov) * field_count);
-	uint32_t part_count = mp_decode_array(&key);
-	assert(part_count == cmp_def->part_count);
-	assert(part_count <= field_count);
-	uint32_t nulls_count = field_count - cmp_def->part_count;
-	uint32_t bsize = mp_sizeof_array(field_count) +
-			 mp_sizeof_nil() * nulls_count;
-	for (uint32_t i = 0; i < part_count; ++i) {
-		const struct key_part *part = &cmp_def->parts[i];
+	struct mh_i64ptr_t *fields_iov_ht = mh_i64ptr_new();
+	if (fields_iov_ht == NULL) {
+		diag_set(OutOfMemory, sizeof(struct mh_i64ptr_t),
+			 "mh_i64ptr_new", "fields_iov_ht");
+		return NULL;
+	}
+	if (mh_i64ptr_reserve(fields_iov_ht, part_count, NULL) != 0) {
+		diag_set(OutOfMemory, part_count, "mh_i64ptr_reserve",
+			 "fields_iov_ht");
+		goto end;
+	}
+	uint32_t bsize = mp_sizeof_array(field_count);
+	uint32_t nulls_count = field_count;
+	memset(iov, 0, sizeof(*iov) * part_count);
+	const struct key_part *part = cmp_def->parts;
+	for (uint32_t i = 0; i < part_count; ++i, ++part) {
 		assert(part->fieldno < field_count);
 		const char *svp = key;
-		iov[part->fieldno].iov_base = (char *) key;
+		iov[i].iov_base = (char *) key;
 		mp_next(&key);
-		iov[part->fieldno].iov_len = key - svp;
-		bsize += key - svp;
+		iov[i].iov_len = key - svp;
+		struct tuple_field *field;
+		if (part->path == NULL) {
+			field = &format->fields[part->fieldno];
+			--nulls_count;
+		} else {
+			struct mh_strnptr_node_t *ht_record =
+				json_path_hash_get(format->path_hash,
+						   part->path, part->path_len,
+						   part->path_hash);
+			assert(ht_record != NULL);
+			field = ht_record->val;
+			assert(field != NULL);
+		}
+		struct mh_i64ptr_node_t node = {(uint64_t)field, &iov[i]};
+		mh_int_t k = mh_i64ptr_put(fields_iov_ht, &node, NULL, NULL);
+		if (k == mh_end(fields_iov_ht))
+			goto end;
+	}
+	for (uint32_t i = 0; i < field_count; ++i) {
+		char *data = NULL;
+		vy_stmt_msgpack_build(&format->fields[i], NULL, NULL, &data,
+				      false, fields_iov_ht);
+		bsize += data - (char *)NULL;
 	}
 
-	struct tuple *stmt = vy_stmt_alloc(format, bsize);
+	stmt = vy_stmt_alloc(format, bsize);
 	if (stmt == NULL)
-		return NULL;
+		goto end;
 
 	char *raw = (char *) tuple_data(stmt);
 	uint32_t *field_map = (uint32_t *) raw;
 	char *wpos = mp_encode_array(raw, field_count);
 	for (uint32_t i = 0; i < field_count; ++i) {
-		const struct tuple_field *field = &format->fields[i];
-		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
-			field_map[field->offset_slot] = wpos - raw;
-		if (iov[i].iov_base == NULL) {
-			wpos = mp_encode_nil(wpos);
-		} else {
-			memcpy(wpos, iov[i].iov_base, iov[i].iov_len);
-			wpos += iov[i].iov_len;
-		}
+		vy_stmt_msgpack_build(&format->fields[i], raw, field_map, &wpos,
+				      true, fields_iov_ht);
 	}
-	assert(wpos == raw + bsize);
+	assert(wpos <= raw + bsize);
 	vy_stmt_set_type(stmt, type);
+
+end:
+	mh_i64ptr_delete(fields_iov_ht);
 	return stmt;
 }
 
diff --git a/test/box/misc.result b/test/box/misc.result
index 4ee4797..b1fda78 100644
--- a/test/box/misc.result
+++ b/test/box/misc.result
@@ -350,7 +350,7 @@ t;
   - 'box.error.CANT_CREATE_COLLATION : 150'
   - 'box.error.USER_EXISTS : 46'
   - 'box.error.WAL_IO : 40'
-  - 'box.error.PROC_RET : 21'
+  - 'box.error.RTREE_RECT : 101'
   - 'box.error.PRIV_GRANTED : 89'
   - 'box.error.CREATE_SPACE : 9'
   - 'box.error.GRANT : 88'
@@ -361,7 +361,7 @@ t;
   - 'box.error.VINYL_MAX_TUPLE_SIZE : 139'
   - 'box.error.LOAD_FUNCTION : 99'
   - 'box.error.INVALID_XLOG : 74'
-  - 'box.error.READ_VIEW_ABORTED : 130'
+  - 'box.error.PRIV_NOT_GRANTED : 91'
   - 'box.error.TRANSACTION_CONFLICT : 97'
   - 'box.error.GUEST_USER_PASSWORD : 96'
   - 'box.error.PROC_C : 102'
@@ -371,8 +371,8 @@ t;
   - 'box.error.DROP_FUNCTION : 71'
   - 'box.error.CFG : 59'
   - 'box.error.NO_SUCH_FIELD : 37'
-  - 'box.error.CONNECTION_TO_SELF : 117'
-  - 'box.error.FUNCTION_MAX : 54'
+  - 'box.error.MORE_THAN_ONE_TUPLE : 41'
+  - 'box.error.PROC_LUA : 32'
   - 'box.error.ILLEGAL_PARAMS : 1'
   - 'box.error.PARTIAL_KEY : 136'
   - 'box.error.SAVEPOINT_NO_TRANSACTION : 114'
@@ -400,34 +400,35 @@ t;
   - 'box.error.UPDATE_ARG_TYPE : 26'
   - 'box.error.CROSS_ENGINE_TRANSACTION : 81'
   - 'box.error.FORMAT_MISMATCH_INDEX_PART : 27'
-  - 'box.error.FUNCTION_TX_ACTIVE : 30'
   - 'box.error.injection : table: <address>
-  - 'box.error.ITERATOR_TYPE : 72'
+  - 'box.error.FUNCTION_TX_ACTIVE : 30'
+  - 'box.error.IDENTIFIER : 70'
+  - 'box.error.TRANSACTION_YIELD : 154'
   - 'box.error.NO_SUCH_ENGINE : 57'
   - 'box.error.COMMIT_IN_SUB_STMT : 122'
-  - 'box.error.TRANSACTION_YIELD : 154'
-  - 'box.error.UNSUPPORTED : 5'
-  - 'box.error.LAST_DROP : 15'
+  - 'box.error.RELOAD_CFG : 58'
   - 'box.error.SPACE_FIELD_IS_DUPLICATE : 149'
+  - 'box.error.LAST_DROP : 15'
+  - 'box.error.SEQUENCE_OVERFLOW : 147'
   - 'box.error.DECOMPRESSION : 124'
   - 'box.error.CREATE_SEQUENCE : 142'
   - 'box.error.CREATE_USER : 43'
-  - 'box.error.SEQUENCE_OVERFLOW : 147'
+  - 'box.error.FUNCTION_MAX : 54'
   - 'box.error.INSTANCE_UUID_MISMATCH : 66'
-  - 'box.error.RELOAD_CFG : 58'
+  - 'box.error.TUPLE_FORMAT_LIMIT : 16'
   - 'box.error.SYSTEM : 115'
   - 'box.error.KEY_PART_IS_TOO_LONG : 118'
-  - 'box.error.MORE_THAN_ONE_TUPLE : 41'
   - 'box.error.TRUNCATE_SYSTEM_SPACE : 137'
-  - 'box.error.NO_SUCH_SAVEPOINT : 61'
   - 'box.error.VY_QUOTA_TIMEOUT : 135'
-  - 'box.error.PRIV_NOT_GRANTED : 91'
+  - 'box.error.NO_SUCH_SAVEPOINT : 61'
+  - 'box.error.PROTOCOL : 104'
+  - 'box.error.READ_VIEW_ABORTED : 130'
   - 'box.error.WRONG_INDEX_OPTIONS : 108'
   - 'box.error.INVALID_VYLOG_FILE : 133'
   - 'box.error.INDEX_FIELD_COUNT_LIMIT : 127'
-  - 'box.error.BEFORE_REPLACE_RET : 53'
+  - 'box.error.DATA_STRUCTURE_MISMATCH : 55'
   - 'box.error.USER_MAX : 56'
-  - 'box.error.INVALID_MSGPACK : 20'
+  - 'box.error.BEFORE_REPLACE_RET : 53'
   - 'box.error.TUPLE_NOT_ARRAY : 22'
   - 'box.error.KEY_PART_COUNT : 31'
   - 'box.error.ALTER_SPACE : 12'
@@ -436,47 +437,47 @@ t;
   - 'box.error.DROP_SEQUENCE : 144'
   - 'box.error.INVALID_XLOG_ORDER : 76'
   - 'box.error.UNKNOWN_REQUEST_TYPE : 48'
-  - 'box.error.PROC_LUA : 32'
+  - 'box.error.PROC_RET : 21'
   - 'box.error.SUB_STMT_MAX : 121'
   - 'box.error.ROLE_NOT_GRANTED : 92'
   - 'box.error.SPACE_EXISTS : 10'
-  - 'box.error.UPDATE_INTEGER_OVERFLOW : 95'
+  - 'box.error.UNSUPPORTED : 5'
   - 'box.error.MIN_FIELD_COUNT : 39'
   - 'box.error.NO_SUCH_SPACE : 36'
   - 'box.error.WRONG_INDEX_PARTS : 107'
   - 'box.error.REPLICASET_UUID_MISMATCH : 63'
   - 'box.error.UPDATE_FIELD : 29'
   - 'box.error.INDEX_EXISTS : 85'
-  - 'box.error.SPLICE : 25'
+  - 'box.error.DROP_SPACE : 11'
   - 'box.error.COMPRESSION : 119'
   - 'box.error.INVALID_ORDER : 68'
-  - 'box.error.UNKNOWN : 0'
+  - 'box.error.SPLICE : 25'
   - 'box.error.NO_SUCH_GROUP : 155'
-  - 'box.error.TUPLE_FORMAT_LIMIT : 16'
+  - 'box.error.INVALID_MSGPACK : 20'
   - 'box.error.DROP_PRIMARY_KEY : 17'
   - 'box.error.NULLABLE_PRIMARY : 152'
   - 'box.error.NO_SUCH_SEQUENCE : 145'
   - 'box.error.INJECTION : 8'
   - 'box.error.INVALID_UUID : 64'
-  - 'box.error.IDENTIFIER : 70'
+  - 'box.error.NO_SUCH_ROLE : 82'
   - 'box.error.TIMEOUT : 78'
+  - 'box.error.ITERATOR_TYPE : 72'
   - 'box.error.REPLICA_MAX : 73'
-  - 'box.error.NO_SUCH_ROLE : 82'
-  - 'box.error.DROP_SPACE : 11'
+  - 'box.error.UNKNOWN : 0'
   - 'box.error.MISSING_REQUEST_FIELD : 69'
   - 'box.error.MISSING_SNAPSHOT : 93'
   - 'box.error.WRONG_SPACE_OPTIONS : 111'
   - 'box.error.READONLY : 7'
-  - 'box.error.RTREE_RECT : 101'
+  - 'box.error.UPDATE_INTEGER_OVERFLOW : 95'
   - 'box.error.UPSERT_UNIQUE_SECONDARY_KEY : 105'
   - 'box.error.NO_CONNECTION : 77'
   - 'box.error.UNSUPPORTED_PRIV : 98'
   - 'box.error.WRONG_SCHEMA_VERSION : 109'
   - 'box.error.ROLLBACK_IN_SUB_STMT : 123'
-  - 'box.error.PROTOCOL : 104'
-  - 'box.error.INVALID_XLOG_TYPE : 125'
-  - 'box.error.INDEX_PART_TYPE_MISMATCH : 24'
   - 'box.error.UNSUPPORTED_INDEX_FEATURE : 112'
+  - 'box.error.CONNECTION_TO_SELF : 117'
+  - 'box.error.INDEX_PART_TYPE_MISMATCH : 24'
+  - 'box.error.INVALID_XLOG_TYPE : 125'
 ...
 test_run:cmd("setopt delimiter ''");
 ---
diff --git a/test/engine/tuple.result b/test/engine/tuple.result
index 35c700e..b74bb23 100644
--- a/test/engine/tuple.result
+++ b/test/engine/tuple.result
@@ -954,6 +954,393 @@ type(tuple:tomap().fourth)
 s:drop()
 ---
 ...
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO["fname"]'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': same key
+    part is indexed twice'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+---
+- error: 'Wrong index options (field 2): ''path'' must be string'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'field.FIO.fname'}}})
+---
+- error: 'Wrong index options (field 2): invalid JSON path: first part should be defined
+    as array index'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = '[3].FIO'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+    ''map'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[3][1]'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+    ''array'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+---
+- error: Field 3 has type 'map' in one index, but type 'string' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3][1].sname'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+---
+- error: Field 3 has type 'map' in one index, but type 'array' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[2].FIO.fname'}}})
+---
+- error: 'Wrong index options (field 2): invalid JSON path: first part refers to invalid
+    field'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO....fname'}}})
+---
+- error: 'Wrong index options (field 3): invalid JSON path ''[3].FIO....fname'': path
+    has invalid structure (error at position 9)'
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+...
+assert(idx ~= nil)
+---
+- true
+...
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+    ''666'': type mismatch: have unsigned, expected map'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+    ''666'': type mismatch: have unsigned, expected string'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+    ''{"fname": "James"}'': map doesn''t contain key ''sname'' defined in index'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+  4, 5]
+...
+idx:select()
+---
+- - [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+  - [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+    4, 5]
+...
+idx:min()
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:max()
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+  4, 5]
+...
+s:drop()
+---
+...
+s = box.schema.create_space('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[1][2]'}
+---
+...
+pk = s:create_index('pk', {parts = parts})
+---
+...
+s:insert{{1, 2}, 3}
+---
+- [[1, 2], 3]
+...
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+---
+...
+s:get(2)
+---
+- [[1, 2], 8]
+...
+s:drop()
+---
+...
+-- Create index on space with data
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk = s:create_index('primary', { type = 'tree' })
+---
+...
+s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
+---
+- [1, 7, {'town': 'London', 'FIO': 1234}, 4, 5]
+...
+s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [2, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+---
+- [4, 7, {'town': 'London', 'FIO': [1, 2, 3]}, 4, 5]
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+    ''1234'': type mismatch: have unsigned, expected map'
+...
+_ = s:delete(1)
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+_ = s:delete(2)
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+    ''[1, 2, 3]'': type mismatch: have array, expected map'
+...
+_ = s:delete(4)
+---
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '[3]["FIO"]["sname"]'}, {3, 'str', path = '[3]["FIO"]["extra"]', is_nullable = true}}})
+---
+...
+assert(idx ~= nil)
+---
+- true
+...
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '[3]["FIO"]["fname"]'}}})
+---
+- error: 'Wrong index options (field 3): JSON path ''[3]["FIO"]["fname"]'' has been
+    already constructed for ''string'' leaf record'
+...
+idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+---
+...
+assert(idx2 ~= nil)
+---
+- true
+...
+t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+---
+...
+-- Test field_map in tuple speed-up access by indexed path.
+t["[3][\"FIO\"][\"fname\"]"]
+---
+- Agent
+...
+idx:select()
+---
+- - [5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+  - [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:min()
+---
+- [5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+...
+idx:max()
+---
+- [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:drop()
+---
+...
+s:drop()
+---
+...
+-- Test complex JSON indexes
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'str', path='[1][3][2].a'}
+---
+...
+parts[2] = {1, 'unsigned', path = '[1][3][1]'}
+---
+...
+parts[3] = {2, 'str', path = '[2][2].d[1]'}
+---
+...
+pk = s:create_index('primary', { type = 'tree', parts =  parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+---
+- [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6,
+  [1, 2, 3]]
+...
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+---
+- error: Duplicate key exists in unique index 'primary' in space 'withdata'
+...
+parts = {}
+---
+...
+parts[1] = {4, 'unsigned', path='[4][1]', is_nullable = false}
+---
+...
+parts[2] = {4, 'unsigned', path='[4][2]', is_nullable = true}
+---
+...
+parts[3] = {4, 'unsigned', path='[4][4]', is_nullable = true}
+---
+...
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 4 document content
+    ''[]'': array size 0 is less than size of 1 defined in index'
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[1][3][2].b' }
+---
+...
+parts[2] = {3, 'unsigned'}
+---
+...
+crosspart_idx = s:create_index('crosspart', { parts =  parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[1][3][2].b'}
+---
+...
+num_idx = s:create_index('numeric', {parts =  parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:get(2)
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+num_idx:select()
+---
+- - [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+      9, 2, 3]]
+  - [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}],
+    6, [1, 2, 3]]
+  - [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+      0]]
+...
+num_idx:max()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:min()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+assert(crosspart_idx:max() == num_idx:max())
+---
+- true
+...
+assert(crosspart_idx:min() == num_idx:min())
+---
+- true
+...
+trap_idx:max()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+trap_idx:min()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+s:drop()
+---
+...
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk_simplified = s:create_index('primary', { type = 'tree',  parts = {{1, 'unsigned', path = '[1]'}}})
+---
+...
+assert(pk_simplified.path == box.NULL)
+---
+- true
+...
+idx = s:create_index('idx', {parts = {{2, 'integer', path = '[2].a'}}})
+---
+...
+s:insert{31, {a = 1, aa = -1}}
+---
+- [31, {'a': 1, 'aa': -1}]
+...
+s:insert{22, {a = 2, aa = -2}}
+---
+- [22, {'a': 2, 'aa': -2}]
+...
+s:insert{13, {a = 3, aa = -3}}
+---
+- [13, {'a': 3, 'aa': -3}]
+...
+idx:select()
+---
+- - [31, {'a': 1, 'aa': -1}]
+  - [22, {'a': 2, 'aa': -2}]
+  - [13, {'a': 3, 'aa': -3}]
+...
+idx:alter({parts = {{2, 'integer', path = '[2].aa'}}})
+---
+...
+idx:select()
+---
+- - [13, {'a': 3, 'aa': -3}]
+  - [22, {'a': 2, 'aa': -2}]
+  - [31, {'a': 1, 'aa': -1}]
+...
+s:drop()
+---
+...
 engine = nil
 ---
 ...
diff --git a/test/engine/tuple.test.lua b/test/engine/tuple.test.lua
index edc3dab..d563c66 100644
--- a/test/engine/tuple.test.lua
+++ b/test/engine/tuple.test.lua
@@ -312,5 +312,114 @@ tuple:tomap().fourth
 type(tuple:tomap().fourth)
 s:drop()
 
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO["fname"]'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'field.FIO.fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = '[3].FIO'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[3][1]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3][1].sname'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[2].FIO.fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO....fname'}}})
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+assert(idx ~= nil)
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+idx:select()
+idx:min()
+idx:max()
+s:drop()
+
+s = box.schema.create_space('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'unsigned', path='[1][2]'}
+pk = s:create_index('pk', {parts = parts})
+s:insert{{1, 2}, 3}
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+s:get(2)
+s:drop()
+
+-- Create index on space with data
+s = box.schema.space.create('withdata', {engine = engine})
+pk = s:create_index('primary', { type = 'tree' })
+s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
+s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+_ = s:delete(1)
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+_ = s:delete(2)
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+_ = s:delete(4)
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '[3]["FIO"]["sname"]'}, {3, 'str', path = '[3]["FIO"]["extra"]', is_nullable = true}}})
+assert(idx ~= nil)
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '[3]["FIO"]["fname"]'}}})
+idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+assert(idx2 ~= nil)
+t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+-- Test field_map in tuple speed-up access by indexed path.
+t["[3][\"FIO\"][\"fname\"]"]
+idx:select()
+idx:min()
+idx:max()
+idx:drop()
+s:drop()
+
+-- Test complex JSON indexes
+s = box.schema.space.create('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'str', path='[1][3][2].a'}
+parts[2] = {1, 'unsigned', path = '[1][3][1]'}
+parts[3] = {2, 'str', path = '[2][2].d[1]'}
+pk = s:create_index('primary', { type = 'tree', parts =  parts})
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+parts = {}
+parts[1] = {4, 'unsigned', path='[4][1]', is_nullable = false}
+parts[2] = {4, 'unsigned', path='[4][2]', is_nullable = true}
+parts[3] = {4, 'unsigned', path='[4][4]', is_nullable = true}
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[1][3][2].b' }
+parts[2] = {3, 'unsigned'}
+crosspart_idx = s:create_index('crosspart', { parts =  parts})
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[1][3][2].b'}
+num_idx = s:create_index('numeric', {parts =  parts})
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+num_idx:get(2)
+num_idx:select()
+num_idx:max()
+num_idx:min()
+assert(crosspart_idx:max() == num_idx:max())
+assert(crosspart_idx:min() == num_idx:min())
+trap_idx:max()
+trap_idx:min()
+s:drop()
+
+s = box.schema.space.create('withdata', {engine = engine})
+pk_simplified = s:create_index('primary', { type = 'tree',  parts = {{1, 'unsigned', path = '[1]'}}})
+assert(pk_simplified.path == box.NULL)
+idx = s:create_index('idx', {parts = {{2, 'integer', path = '[2].a'}}})
+s:insert{31, {a = 1, aa = -1}}
+s:insert{22, {a = 2, aa = -2}}
+s:insert{13, {a = 3, aa = -3}}
+idx:select()
+idx:alter({parts = {{2, 'integer', path = '[2].aa'}}})
+idx:select()
+s:drop()
+
 engine = nil
 test_run = nil
-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3.1 4/5] box: introduce offset slot cache in key_part
  2018-09-20  7:46 [PATCH v3.1 0/5] box: indexes by JSON path Kirill Shcherbatov
                   ` (2 preceding siblings ...)
  2018-09-20  7:46 ` [PATCH v3.1 3/5] box: introduce JSON indexes Kirill Shcherbatov
@ 2018-09-20  7:46 ` Kirill Shcherbatov
  2018-09-20  7:46 ` [PATCH v3.1 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
  4 siblings, 0 replies; 9+ messages in thread
From: Kirill Shcherbatov @ 2018-09-20  7:46 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, Kirill Shcherbatov

Same key_part could be used in different formats multiple
times, so different field->offset_slot would be allocated.
In most scenarios we work with series of tuples of same
format, and (in general) format lookup for field would be
expensive operation for JSON-paths defined in key_part.
New slot_cache field in key_part structure and epoch-based
mechanism to validate it actuality should be effective
approach to improve performance.

Part of #1012.
---
 src/box/alter.cc                |  7 +++++--
 src/box/blackhole.c             |  5 +++--
 src/box/engine.h                | 11 +++++-----
 src/box/key_def.c               | 12 +++++++++--
 src/box/key_def.h               | 11 ++++++++++
 src/box/memtx_engine.c          |  4 ++--
 src/box/memtx_space.c           |  5 +++--
 src/box/memtx_space.h           |  2 +-
 src/box/schema.cc               |  4 ++--
 src/box/space.c                 |  4 ++--
 src/box/space.h                 |  8 +++++---
 src/box/sysview.c               |  3 ++-
 src/box/tuple.c                 |  4 ++--
 src/box/tuple_format.c          | 22 +++++++++++++++-----
 src/box/tuple_format.h          | 10 ++++++++-
 src/box/vinyl.c                 |  7 ++++---
 src/box/vy_lsm.c                | 45 +++++++++++++++++++++++++++++++++++++++--
 test/unit/vy_iterators_helper.c |  6 +++---
 test/unit/vy_mem.c              |  2 +-
 test/unit/vy_point_lookup.c     |  2 +-
 20 files changed, 132 insertions(+), 42 deletions(-)

diff --git a/src/box/alter.cc b/src/box/alter.cc
index a6299a1..b90ab3e 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -883,7 +883,10 @@ alter_space_do(struct txn *txn, struct alter_space *alter)
 	 * Create a new (empty) space for the new definition.
 	 * Sic: the triggers are not moved over yet.
 	 */
-	alter->new_space = space_new_xc(alter->space_def, &alter->key_list);
+	alter->new_space =
+		space_new_xc(alter->space_def, &alter->key_list,
+			     alter->old_space->format != NULL ?
+			     alter->old_space->format->epoch + 1 : 1);
 	/*
 	 * Copy the replace function, the new space is at the same recovery
 	 * phase as the old one. This hack is especially necessary for
@@ -1604,7 +1607,7 @@ on_replace_dd_space(struct trigger * /* trigger */, void *event)
 		auto def_guard =
 			make_scoped_guard([=] { space_def_delete(def); });
 		RLIST_HEAD(empty_list);
-		struct space *space = space_new_xc(def, &empty_list);
+		struct space *space = space_new_xc(def, &empty_list, 0);
 		/**
 		 * The new space must be inserted in the space
 		 * cache right away to achieve linearisable
diff --git a/src/box/blackhole.c b/src/box/blackhole.c
index f979304..517daf9 100644
--- a/src/box/blackhole.c
+++ b/src/box/blackhole.c
@@ -135,7 +135,7 @@ blackhole_engine_shutdown(struct engine *engine)
 
 static struct space *
 blackhole_engine_create_space(struct engine *engine, struct space_def *def,
-			      struct rlist *key_list)
+			      struct rlist *key_list, uint64_t epoch)
 {
 	if (!rlist_empty(key_list)) {
 		diag_set(ClientError, ER_UNSUPPORTED, "Blackhole", "indexes");
@@ -152,7 +152,8 @@ blackhole_engine_create_space(struct engine *engine, struct space_def *def,
 	/* Allocate tuples on runtime arena, but check space format. */
 	struct tuple_format *format;
 	format = tuple_format_new(&tuple_format_runtime->vtab, NULL, 0, 0,
-				  def->fields, def->field_count, def->dict);
+				  def->fields, def->field_count, def->dict,
+				  epoch);
 	if (format == NULL) {
 		free(space);
 		return NULL;
diff --git a/src/box/engine.h b/src/box/engine.h
index 5b96c74..0e8c76c 100644
--- a/src/box/engine.h
+++ b/src/box/engine.h
@@ -72,7 +72,8 @@ struct engine_vtab {
 	void (*shutdown)(struct engine *);
 	/** Allocate a new space instance. */
 	struct space *(*create_space)(struct engine *engine,
-			struct space_def *def, struct rlist *key_list);
+			struct space_def *def, struct rlist *key_list,
+			uint64_t epoch);
 	/**
 	 * Write statements stored in checkpoint @vclock to @stream.
 	 */
@@ -237,9 +238,9 @@ engine_find(const char *name)
 
 static inline struct space *
 engine_create_space(struct engine *engine, struct space_def *def,
-		    struct rlist *key_list)
+		    struct rlist *key_list, uint64_t epoch)
 {
-	return engine->vtab->create_space(engine, def, key_list);
+	return engine->vtab->create_space(engine, def, key_list, epoch);
 }
 
 static inline int
@@ -390,9 +391,9 @@ engine_find_xc(const char *name)
 
 static inline struct space *
 engine_create_space_xc(struct engine *engine, struct space_def *def,
-		    struct rlist *key_list)
+		    struct rlist *key_list, uint64_t epoch)
 {
-	struct space *space = engine_create_space(engine, def, key_list);
+	struct space *space = engine_create_space(engine, def, key_list, epoch);
 	if (space == NULL)
 		diag_raise();
 	return space;
diff --git a/src/box/key_def.c b/src/box/key_def.c
index 5366eb7..216d858 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -316,6 +316,8 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 	def->parts[part_no].type = type;
 	def->parts[part_no].coll = coll;
 	def->parts[part_no].coll_id = coll_id;
+	def->parts[part_no].offset_slot = TUPLE_OFFSET_SLOT_NIL;
+	def->parts[part_no].offset_slot_epoch = 0;
 	if (path != NULL) {
 		def->parts[part_no].path_len = path_len;
 		assert(def->parts[part_no].path != NULL);
@@ -774,9 +776,12 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 			new_def->parts[pos].path = data;
 			data += part->path_len + 1;
 		}
-		key_def_set_part(new_def, pos++, part->fieldno, part->type,
+		key_def_set_part(new_def, pos, part->fieldno, part->type,
 				 part->is_nullable, part->coll, part->coll_id,
 				 part->path, part->path_len);
+		new_def->parts[pos].offset_slot_epoch = part->offset_slot_epoch;
+		new_def->parts[pos].offset_slot = part->offset_slot;
+		pos++;
 	}
 
 	/* Set-append second key def's part to the new key def. */
@@ -789,9 +794,12 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 			new_def->parts[pos].path = data;
 			data += part->path_len + 1;
 		}
-		key_def_set_part(new_def, pos++, part->fieldno, part->type,
+		key_def_set_part(new_def, pos, part->fieldno, part->type,
 				 part->is_nullable, part->coll, part->coll_id,
 				 part->path, part->path_len);
+		new_def->parts[pos].offset_slot_epoch = part->offset_slot_epoch;
+		new_def->parts[pos].offset_slot = part->offset_slot;
+		pos++;
 	}
 	return new_def;
 }
diff --git a/src/box/key_def.h b/src/box/key_def.h
index d451258..66691c5 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -82,6 +82,17 @@ struct key_part {
 	uint32_t path_len;
 	/** JSON path hash. */
 	uint32_t path_hash;
+	/**
+	 * Epoch of offset slot cache. Initialized with
+	 * incremental epoch of format on caching it's field's
+	 * offset_slot via tuple_field_by_part_raw to speed up
+	 * access on subsequent calls with same format.
+	 * Cache is expected to use "the newest format is most
+	 * relevant" strategy.
+	 */
+	uint64_t offset_slot_epoch;
+	/** Cache with format's field offset slot. */
+	int32_t offset_slot;
 };
 
 struct key_def;
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 554d6f9..62c3c9a 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -358,10 +358,10 @@ memtx_engine_end_recovery(struct engine *engine)
 
 static struct space *
 memtx_engine_create_space(struct engine *engine, struct space_def *def,
-			  struct rlist *key_list)
+			  struct rlist *key_list, uint64_t epoch)
 {
 	struct memtx_engine *memtx = (struct memtx_engine *)engine;
-	return memtx_space_new(memtx, def, key_list);
+	return memtx_space_new(memtx, def, key_list, epoch);
 }
 
 static int
diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
index 08ae0da..3ac606f 100644
--- a/src/box/memtx_space.c
+++ b/src/box/memtx_space.c
@@ -884,7 +884,7 @@ static const struct space_vtab memtx_space_vtab = {
 
 struct space *
 memtx_space_new(struct memtx_engine *memtx,
-		struct space_def *def, struct rlist *key_list)
+		struct space_def *def, struct rlist *key_list, uint64_t epoch)
 {
 	struct memtx_space *memtx_space = malloc(sizeof(*memtx_space));
 	if (memtx_space == NULL) {
@@ -910,7 +910,8 @@ memtx_space_new(struct memtx_engine *memtx,
 
 	struct tuple_format *format =
 		tuple_format_new(&memtx_tuple_format_vtab, keys, key_count, 0,
-				 def->fields, def->field_count, def->dict);
+				 def->fields, def->field_count, def->dict,
+				 epoch);
 	if (format == NULL) {
 		free(memtx_space);
 		return NULL;
diff --git a/src/box/memtx_space.h b/src/box/memtx_space.h
index 7dc3410..b5bec0c 100644
--- a/src/box/memtx_space.h
+++ b/src/box/memtx_space.h
@@ -79,7 +79,7 @@ memtx_space_replace_all_keys(struct space *, struct tuple *, struct tuple *,
 
 struct space *
 memtx_space_new(struct memtx_engine *memtx,
-		struct space_def *def, struct rlist *key_list);
+		struct space_def *def, struct rlist *key_list, uint64_t epoch);
 
 #if defined(__cplusplus)
 } /* extern "C" */
diff --git a/src/box/schema.cc b/src/box/schema.cc
index 2411c8c..cf7cf35 100644
--- a/src/box/schema.cc
+++ b/src/box/schema.cc
@@ -214,7 +214,7 @@ sc_space_new(uint32_t id, const char *name, struct key_def *key_def,
 	struct rlist key_list;
 	rlist_create(&key_list);
 	rlist_add_entry(&key_list, index_def, link);
-	struct space *space = space_new_xc(def, &key_list);
+	struct space *space = space_new_xc(def, &key_list, 0);
 	(void) space_cache_replace(space);
 	if (replace_trigger)
 		trigger_add(&space->on_replace, replace_trigger);
@@ -380,7 +380,7 @@ schema_init()
 			space_def_delete(def);
 		});
 		RLIST_HEAD(key_list);
-		struct space *space = space_new_xc(def, &key_list);
+		struct space *space = space_new_xc(def, &key_list, 0);
 		space_cache_replace(space);
 		init_system_space(space);
 		trigger_run_xc(&on_alter_space, space);
diff --git a/src/box/space.c b/src/box/space.c
index 871cc67..2e4df74 100644
--- a/src/box/space.c
+++ b/src/box/space.c
@@ -181,12 +181,12 @@ fail:
 }
 
 struct space *
-space_new(struct space_def *def, struct rlist *key_list)
+space_new(struct space_def *def, struct rlist *key_list, uint64_t epoch)
 {
 	struct engine *engine = engine_find(def->engine_name);
 	if (engine == NULL)
 		return NULL;
-	return engine_create_space(engine, def, key_list);
+	return engine_create_space(engine, def, key_list, epoch);
 }
 
 void
diff --git a/src/box/space.h b/src/box/space.h
index 8888ec8..068ea4b 100644
--- a/src/box/space.h
+++ b/src/box/space.h
@@ -378,10 +378,11 @@ struct field_def;
  * Allocate and initialize a space.
  * @param space_def Space definition.
  * @param key_list List of index_defs.
+ * @param epoch Last epoch to initialize format.
  * @retval Space object.
  */
 struct space *
-space_new(struct space_def *space_def, struct rlist *key_list);
+space_new(struct space_def *space_def, struct rlist *key_list, uint64_t epoch);
 
 /** Destroy and free a space. */
 void
@@ -416,9 +417,10 @@ int generic_space_prepare_alter(struct space *, struct space *);
 } /* extern "C" */
 
 static inline struct space *
-space_new_xc(struct space_def *space_def, struct rlist *key_list)
+space_new_xc(struct space_def *space_def, struct rlist *key_list,
+	     uint64_t epoch)
 {
-	struct space *space = space_new(space_def, key_list);
+	struct space *space = space_new(space_def, key_list, epoch);
 	if (space == NULL)
 		diag_raise();
 	return space;
diff --git a/src/box/sysview.c b/src/box/sysview.c
index a636c68..d35ff71 100644
--- a/src/box/sysview.c
+++ b/src/box/sysview.c
@@ -504,8 +504,9 @@ sysview_engine_shutdown(struct engine *engine)
 
 static struct space *
 sysview_engine_create_space(struct engine *engine, struct space_def *def,
-			    struct rlist *key_list)
+			    struct rlist *key_list, uint64_t epoch)
 {
+	(void)epoch;
 	struct space *space = (struct space *)calloc(1, sizeof(*space));
 	if (space == NULL) {
 		diag_set(OutOfMemory, sizeof(*space),
diff --git a/src/box/tuple.c b/src/box/tuple.c
index 83bdad1..1d27c63 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -231,7 +231,7 @@ tuple_init(field_name_hash_f hash)
 	 * Create a format for runtime tuples
 	 */
 	tuple_format_runtime = tuple_format_new(&tuple_format_runtime_vtab,
-						NULL, 0, 0, NULL, 0, NULL);
+						NULL, 0, 0, NULL, 0, NULL, 0);
 	if (tuple_format_runtime == NULL)
 		return -1;
 
@@ -403,7 +403,7 @@ box_tuple_format_new(struct key_def **keys, uint16_t key_count)
 {
 	box_tuple_format_t *format =
 		tuple_format_new(&tuple_format_runtime_vtab,
-				 keys, key_count, 0, NULL, 0, NULL);
+				 keys, key_count, 0, NULL, 0, NULL, 0);
 	if (format != NULL)
 		tuple_format_ref(format);
 	return format;
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index fa8b938..e750129 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -753,6 +753,7 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
 		format->dict = dict;
 		tuple_dictionary_ref(dict);
 	}
+	format->epoch = 0;
 	format->refs = 0;
 	format->id = FORMAT_ID_NIL;
 	format->field_count = field_count;
@@ -789,7 +790,8 @@ struct tuple_format *
 tuple_format_new(struct tuple_format_vtab *vtab, struct key_def * const *keys,
 		 uint16_t key_count, uint16_t extra_size,
 		 const struct field_def *space_fields,
-		 uint32_t space_field_count, struct tuple_dictionary *dict)
+		 uint32_t space_field_count, struct tuple_dictionary *dict,
+		 uint64_t epoch)
 {
 	struct tuple_format *format =
 		tuple_format_alloc(keys, key_count, space_field_count, dict);
@@ -799,6 +801,7 @@ tuple_format_new(struct tuple_format_vtab *vtab, struct key_def * const *keys,
 	format->engine = NULL;
 	format->extra_size = extra_size;
 	format->is_temporary = false;
+	format->epoch = epoch;
 	if (tuple_format_register(format) < 0) {
 		tuple_format_destroy(format);
 		free(format);
@@ -1095,13 +1098,22 @@ tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
 
 	struct mh_strnptr_node_t *ht_record = NULL;
 	int32_t offset_slot;
-	if (format->path_hash != NULL &&
-	    (ht_record = json_path_hash_get(format->path_hash, part->path,
-					    part->path_len,
-					    part->path_hash)) != NULL) {
+	if (likely(part->offset_slot_epoch == format->epoch)) {
+		assert(format->epoch != 0);
+		offset_slot = part->offset_slot;
+	} else if (format->path_hash != NULL &&
+		   (ht_record = json_path_hash_get(format->path_hash,
+						   part->path, part->path_len,
+						   part->path_hash)) != NULL) {
 		struct tuple_field *field = ht_record->val;
 		assert(field != NULL);
 		offset_slot = field->offset_slot;
+		/* Cache offset_slot if required. */
+		if (part->offset_slot_epoch < format->epoch) {
+			assert(format->epoch != 0);
+			part->offset_slot = offset_slot;
+			part->offset_slot_epoch = format->epoch;
+		}
 	} else {
 		/*
 		 * Legacy tuple having no field map for
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index df500db..b347739 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -132,6 +132,12 @@ struct tuple_field {
  * Tuple format describes how tuple is stored and information about its fields
  */
 struct tuple_format {
+	/**
+	 * Counter that grows incrementally on space rebuild if
+	 * format has other distribution of offset slots comparing
+	 * with previous one.
+	 */
+	uint64_t epoch;
 	/** Virtual function table */
 	struct tuple_format_vtab vtab;
 	/** Pointer to engine-specific data. */
@@ -226,6 +232,7 @@ tuple_format_unref(struct tuple_format *format)
  * @param extra_size Extra bytes to reserve in tuples metadata.
  * @param space_fields Array of fields, defined in a space format.
  * @param space_field_count Length of @a space_fields.
+ * @param epoch Epoch of new format.
  *
  * @retval not NULL Tuple format.
  * @retval     NULL Memory error.
@@ -234,7 +241,8 @@ struct tuple_format *
 tuple_format_new(struct tuple_format_vtab *vtab, struct key_def * const *keys,
 		 uint16_t key_count, uint16_t extra_size,
 		 const struct field_def *space_fields,
-		 uint32_t space_field_count, struct tuple_dictionary *dict);
+		 uint32_t space_field_count, struct tuple_dictionary *dict,
+		 uint64_t epoch);
 
 /**
  * Check, if @a format1 can store any tuples of @a format2. For
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 073466f..fb9e0e0 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -584,7 +584,7 @@ vinyl_engine_check_space_def(struct space_def *def)
 
 static struct space *
 vinyl_engine_create_space(struct engine *engine, struct space_def *def,
-			  struct rlist *key_list)
+			  struct rlist *key_list, uint64_t epoch)
 {
 	struct space *space = malloc(sizeof(*space));
 	if (space == NULL) {
@@ -610,7 +610,8 @@ vinyl_engine_create_space(struct engine *engine, struct space_def *def,
 
 	struct tuple_format *format =
 		tuple_format_new(&vy_tuple_format_vtab, keys, key_count, 0,
-				 def->fields, def->field_count, def->dict);
+				 def->fields, def->field_count, def->dict,
+				 epoch);
 	if (format == NULL) {
 		free(space);
 		return NULL;
@@ -3030,7 +3031,7 @@ vy_send_lsm(struct vy_join_ctx *ctx, struct vy_lsm_recovery_info *lsm_info)
 	if (ctx->key_def == NULL)
 		goto out;
 	ctx->format = tuple_format_new(&vy_tuple_format_vtab, &ctx->key_def,
-				       1, 0, NULL, 0, NULL);
+				       1, 0, NULL, 0, NULL, 0);
 	if (ctx->format == NULL)
 		goto out_free_key_def;
 	tuple_format_ref(ctx->format);
diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
index b55933f..e0ce3ec 100644
--- a/src/box/vy_lsm.c
+++ b/src/box/vy_lsm.c
@@ -62,7 +62,7 @@ vy_lsm_env_create(struct vy_lsm_env *env, const char *path,
 		  void *upsert_thresh_arg)
 {
 	env->key_format = tuple_format_new(&vy_tuple_format_vtab,
-					   NULL, 0, 0, NULL, 0, NULL);
+					   NULL, 0, 0, NULL, 0, NULL, 0);
 	if (env->key_format == NULL)
 		return -1;
 	tuple_format_ref(env->key_format);
@@ -156,9 +156,50 @@ vy_lsm_new(struct vy_lsm_env *lsm_env, struct vy_cache_env *cache_env,
 	} else {
 		lsm->disk_format = tuple_format_new(&vy_tuple_format_vtab,
 						    &cmp_def, 1, 0, NULL, 0,
-						    NULL);
+						    NULL, format->epoch);
 		if (lsm->disk_format == NULL)
 			goto fail_format;
+		/*
+		 * Tuple formats should be compatible to make
+		 * epoch-based caching work.
+		 */
+		int32_t min_offset_slot = 0;
+		struct tuple_field *dst_fields = lsm->disk_format->fields;
+		struct mh_strnptr_t *dst_ht = lsm->disk_format->path_hash;
+		struct mh_strnptr_t *src_ht = format->path_hash;
+		struct key_part *part = cmp_def->parts;
+		struct key_part *part_end = part + cmp_def->part_count;
+		for (; part < part_end; part++) {
+			struct tuple_field *dst_field =
+				&dst_fields[part->fieldno];
+			struct tuple_field *src_field;
+			if (dst_field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
+				src_field = &format->fields[part->fieldno];
+			} else if (part->path != NULL) {
+				struct mh_strnptr_node_t *ht_record;
+				ht_record = json_path_hash_get(dst_ht,
+							       part->path,
+							       part->path_len,
+							       part->path_hash);
+				assert(ht_record != NULL);
+				dst_field = ht_record->val;
+				assert(dst_field != NULL);
+				ht_record = json_path_hash_get(src_ht,
+							       part->path,
+							       part->path_len,
+							       part->path_hash);
+				assert(ht_record != NULL);
+				src_field = ht_record->val;
+				assert(src_field != NULL);
+			} else {
+				continue;
+			}
+			if (src_field->offset_slot == TUPLE_OFFSET_SLOT_NIL)
+				continue;
+			dst_field->offset_slot = src_field->offset_slot;
+			min_offset_slot = MIN(src_field->offset_slot, min_offset_slot);
+		}
+		lsm->disk_format->field_map_size = -min_offset_slot * sizeof(uint32_t);
 	}
 	tuple_format_ref(lsm->disk_format);
 
diff --git a/test/unit/vy_iterators_helper.c b/test/unit/vy_iterators_helper.c
index 40d8d6a..bfef6a2 100644
--- a/test/unit/vy_iterators_helper.c
+++ b/test/unit/vy_iterators_helper.c
@@ -22,7 +22,7 @@ vy_iterator_C_test_init(size_t cache_size)
 	vy_cache_env_create(&cache_env, cord_slab_cache());
 	vy_cache_env_set_quota(&cache_env, cache_size);
 	vy_key_format = tuple_format_new(&vy_tuple_format_vtab, NULL, 0, 0,
-					 NULL, 0, NULL);
+					 NULL, 0, NULL, 0);
 	tuple_format_ref(vy_key_format);
 
 	size_t mem_size = 64 * 1024 * 1024;
@@ -210,7 +210,7 @@ create_test_mem(struct key_def *def)
 	struct key_def * const defs[] = { def };
 	struct tuple_format *format =
 		tuple_format_new(&vy_tuple_format_vtab, defs, def->part_count,
-				 0, NULL, 0, NULL);
+				 0, NULL, 0, NULL, 0);
 	fail_if(format == NULL);
 
 	/* Create format with column mask */
@@ -234,7 +234,7 @@ create_test_cache(uint32_t *fields, uint32_t *types,
 	assert(*def != NULL);
 	vy_cache_create(cache, &cache_env, *def, true);
 	*format = tuple_format_new(&vy_tuple_format_vtab, def, 1, 0, NULL, 0,
-				   NULL);
+				   NULL, 0);
 	tuple_format_ref(*format);
 }
 
diff --git a/test/unit/vy_mem.c b/test/unit/vy_mem.c
index 967aabe..d797b3d 100644
--- a/test/unit/vy_mem.c
+++ b/test/unit/vy_mem.c
@@ -78,7 +78,7 @@ test_iterator_restore_after_insertion()
 	/* Create format */
 	struct tuple_format *format = tuple_format_new(&vy_tuple_format_vtab,
 						       &key_def, 1, 0, NULL, 0,
-						       NULL);
+						       NULL, 0);
 	assert(format != NULL);
 	tuple_format_ref(format);
 
diff --git a/test/unit/vy_point_lookup.c b/test/unit/vy_point_lookup.c
index 86877d7..f5756ca 100644
--- a/test/unit/vy_point_lookup.c
+++ b/test/unit/vy_point_lookup.c
@@ -85,7 +85,7 @@ test_basic()
 	vy_cache_create(&cache, &cache_env, key_def, true);
 	struct tuple_format *format = tuple_format_new(&vy_tuple_format_vtab,
 						       &key_def, 1, 0, NULL, 0,
-						       NULL);
+						       NULL, 0);
 	isnt(format, NULL, "tuple_format_new is not NULL");
 	tuple_format_ref(format);
 
-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3.1 5/5] box: specify indexes in user-friendly form
  2018-09-20  7:46 [PATCH v3.1 0/5] box: indexes by JSON path Kirill Shcherbatov
                   ` (3 preceding siblings ...)
  2018-09-20  7:46 ` [PATCH v3.1 4/5] box: introduce offset slot cache in key_part Kirill Shcherbatov
@ 2018-09-20  7:46 ` Kirill Shcherbatov
  4 siblings, 0 replies; 9+ messages in thread
From: Kirill Shcherbatov @ 2018-09-20  7:46 UTC (permalink / raw)
  To: tarantool-patches; +Cc: vdavydov.dev, Kirill Shcherbatov

Since now it is possible to create indexes by JSON-path
using field names specified in format.

Closes #1012.

@TarantoolBot document
Title: Indexes by JSON path
Sometimes field data could have complex document structure.
When this structure is consistent across whole document,
you are able to create an index by JSON path.

Example:
s:create_index('json_index',
           {parts = {{'data.FIO["fname"]', 'str'}}})
---
 src/box/lua/index.c         | 73 +++++++++++++++++++++++++++++++++++++++++++++
 src/box/lua/schema.lua      | 20 ++++++-------
 src/box/tuple_format.c      |  3 +-
 test/engine/iterator.result |  2 +-
 test/engine/tuple.result    | 41 +++++++++++++++++++++++++
 test/engine/tuple.test.lua  | 12 ++++++++
 6 files changed, 137 insertions(+), 14 deletions(-)

diff --git a/src/box/lua/index.c b/src/box/lua/index.c
index ef89c39..8f1b038 100644
--- a/src/box/lua/index.c
+++ b/src/box/lua/index.c
@@ -28,6 +28,9 @@
  * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
+#include "box/schema.h"
+#include "box/tuple_format.h"
+#include "json/path.h"
 #include "box/lua/index.h"
 #include "lua/utils.h"
 #include "box/box.h"
@@ -328,6 +331,75 @@ lbox_index_compact(lua_State *L)
 	return 0;
 }
 
+static int
+lbox_index_resolve_path(struct lua_State *L)
+{
+	if (lua_gettop(L) != 3 ||
+	    !lua_isnumber(L, 1) || !lua_isnumber(L, 2) || !lua_isstring(L, 3)) {
+		return luaL_error(L, "Usage box.internal."
+				     "path_resolve(part_id, space_id, path)");
+	}
+	uint32_t part_id = lua_tonumber(L, 1);
+	uint32_t space_id = lua_tonumber(L, 2);
+	size_t path_len;
+	const char *path = lua_tolstring(L, 3, &path_len);
+	struct space *space = space_cache_find(space_id);
+	if (space == NULL)
+		return luaT_error(L);
+	struct json_path_parser parser;
+	struct json_path_node node;
+	json_path_parser_create(&parser, path, path_len);
+	int rc = json_path_next(&parser, &node);
+	if (rc != 0) {
+		const char *err_msg =
+			tt_sprintf("options.parts[%d]: error in path on "
+				   "position %d", part_id, rc);
+		diag_set(ClientError, ER_ILLEGAL_PARAMS, err_msg);
+		return luaT_error(L);
+	}
+	assert(space->format != NULL && space->format->dict != NULL);
+	uint32_t fieldno;
+	if (node.type == JSON_PATH_NUM &&
+	    (fieldno = node.num - TUPLE_INDEX_BASE) >=
+	    space->format->field_count) {
+		const char *err_msg =
+			tt_sprintf("options.parts[%d]: field '%d' referenced "
+				   "in path is greater than format field "
+				   "count %d", part_id,
+				   fieldno + TUPLE_INDEX_BASE,
+				   space->format->field_count);
+		diag_set(ClientError, ER_ILLEGAL_PARAMS, err_msg);
+		return luaT_error(L);
+	} else if (node.type == JSON_PATH_STR &&
+		   tuple_fieldno_by_name(space->format->dict, node.str, node.len,
+					 field_name_hash(node.str, node.len),
+					 &fieldno) != 0) {
+		const char *err_msg =
+			tt_sprintf("options.parts[%d]: field was not found by "
+				   "name '%.*s'", part_id, node.len, node.str);
+		diag_set(ClientError, ER_ILLEGAL_PARAMS, err_msg);
+		return luaT_error(L);
+	}
+	fieldno += TUPLE_INDEX_BASE;
+
+	char *path_resolved = region_alloc(&fiber()->gc, path_len + 1);
+	if (path_resolved == NULL) {
+		diag_set(OutOfMemory, path_len + 1, "region_alloc",
+			"path_resolved");
+		return luaT_error(L);
+	}
+
+	path = path_resolved;
+	path_resolved += sprintf(path_resolved, "[%d]", fieldno);
+	memcpy(path_resolved, parser.src + parser.offset,
+		parser.src_len - parser.offset);
+	path_resolved[parser.src_len - parser.offset] = '\0';
+
+	lua_pushnumber(L, fieldno);
+	lua_pushstring(L, path);
+	return 2;
+}
+
 /* }}} */
 
 void
@@ -365,6 +437,7 @@ box_lua_index_init(struct lua_State *L)
 		{"truncate", lbox_truncate},
 		{"stat", lbox_index_stat},
 		{"compact", lbox_index_compact},
+		{"path_resolve", lbox_index_resolve_path},
 		{NULL, NULL}
 	};
 
diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
index 540a2a5..bc11375 100644
--- a/src/box/lua/schema.lua
+++ b/src/box/lua/schema.lua
@@ -556,7 +556,7 @@ local function update_index_parts_1_6_0(parts)
     return result
 end
 
-local function update_index_parts(format, parts)
+local function update_index_parts(format, parts, space_id)
     if type(parts) ~= "table" then
         box.error(box.error.ILLEGAL_PARAMS,
         "options.parts parameter should be a table")
@@ -607,16 +607,14 @@ local function update_index_parts(format, parts)
             box.error(box.error.ILLEGAL_PARAMS,
                       "options.parts[" .. i .. "]: field (name or number) is expected")
         elseif type(part.field) == 'string' then
-            for k,v in pairs(format) do
-                if v.name == part.field then
-                    part.field = k
-                    break
-                end
-            end
-            if type(part.field) == 'string' then
+            local idx, path = box.internal.path_resolve(i, space_id, part.field)
+            if part.path ~= nil and part.path ~= path then
                 box.error(box.error.ILLEGAL_PARAMS,
-                          "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")
+                          "options.parts[" .. i .. "]: field path '"..part.path.." doesn't math path resolved by name '" .. part.field .. "'")
             end
+            parts_can_be_simplified = parts_can_be_simplified and path == nil
+            part.field = idx
+            part.path = path or part.path
         elseif part.field == 0 then
             box.error(box.error.ILLEGAL_PARAMS,
                       "options.parts[" .. i .. "]: field (number) must be one-based")
@@ -767,7 +765,7 @@ box.schema.index.create = function(space_id, name, options)
         end
     end
     local parts, parts_can_be_simplified =
-        update_index_parts(format, options.parts)
+        update_index_parts(format, options.parts, space_id)
     -- create_index() options contains type, parts, etc,
     -- stored separately. Remove these members from index_opts
     local index_opts = {
@@ -934,7 +932,7 @@ box.schema.index.alter = function(space_id, index_id, options)
     if options.parts then
         local parts_can_be_simplified
         parts, parts_can_be_simplified =
-            update_index_parts(format, options.parts)
+            update_index_parts(format, options.parts, space_id)
         -- save parts in old format if possible
         if parts_can_be_simplified then
             parts = simplify_index_parts(parts)
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index e750129..a33e473 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -1119,8 +1119,7 @@ tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
 		 * Legacy tuple having no field map for
 		 * JSON index.
 		 */
-		uint32_t path_hash =
-			field_name_hash(part->path, part->path_len);
+		uint32_t path_hash = field_name_hash(part->path, part->path_len);
 		const char *raw = NULL;
 		if (tuple_field_raw_by_path(format, data, field_map,
 					    part->path, part->path_len,
diff --git a/test/engine/iterator.result b/test/engine/iterator.result
index 10097ed..05d892d 100644
--- a/test/engine/iterator.result
+++ b/test/engine/iterator.result
@@ -4213,7 +4213,7 @@ s:replace{35}
 ...
 state, value = gen(param,state)
 ---
-- error: 'builtin/box/schema.lua:1034: usage: next(param, state)'
+- error: 'builtin/box/schema.lua:1032: usage: next(param, state)'
 ...
 value
 ---
diff --git a/test/engine/tuple.result b/test/engine/tuple.result
index b74bb23..d28d6b4 100644
--- a/test/engine/tuple.result
+++ b/test/engine/tuple.result
@@ -1009,6 +1009,47 @@ assert(idx ~= nil)
 ---
 - true
 ...
+format = {{'int1', 'unsigned'}, {'int2', 'unsigned'}, {'data', 'array'}, {'int3', 'unsigned'}, {'int4', 'unsigned'}}
+---
+...
+s:format(format)
+---
+- error: Field 3 has type 'map' in one index, but type 'array' in another
+...
+format = {{'int1', 'unsigned'}, {'int2', 'unsigned'}, {'data', 'map'}, {'int3', 'unsigned'}, {'int4', 'unsigned'}}
+---
+...
+s:format(format)
+---
+...
+s:create_index('test3', {parts = {{2, 'number'}, {']sad.FIO["fname"]', 'str'}}})
+---
+- error: 'Illegal parameters, options.parts[2]: error in path on position 1'
+...
+s:create_index('test3', {parts = {{2, 'number'}, {'[666].FIO["fname"]', 'str'}}})
+---
+- error: 'Illegal parameters, options.parts[2]: field ''666'' referenced in path is
+    greater than format field count 5'
+...
+s:create_index('test3', {parts = {{2, 'number'}, {'invalid.FIO["fname"]', 'str'}}})
+---
+- error: 'Illegal parameters, options.parts[2]: field was not found by name ''invalid'''
+...
+idx3 = s:create_index('test3', {parts = {{2, 'number'}, {'data.FIO["fname"]', 'str'}}})
+---
+...
+assert(idx3 ~= nil)
+---
+- true
+...
+assert(idx3.parts[2].path == "[3][\"FIO\"][\"fname\"]")
+---
+- true
+...
+-- Vinyl has optimizations that omit index checks, so errors could differ.
+idx3:drop()
+---
+...
 s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
 ---
 - error: 'Tuple doesn''t math document structure: invalid field 3 document content
diff --git a/test/engine/tuple.test.lua b/test/engine/tuple.test.lua
index d563c66..1508f99 100644
--- a/test/engine/tuple.test.lua
+++ b/test/engine/tuple.test.lua
@@ -327,6 +327,18 @@ s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[2].FIO.fnam
 s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO....fname'}}})
 idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
 assert(idx ~= nil)
+format = {{'int1', 'unsigned'}, {'int2', 'unsigned'}, {'data', 'array'}, {'int3', 'unsigned'}, {'int4', 'unsigned'}}
+s:format(format)
+format = {{'int1', 'unsigned'}, {'int2', 'unsigned'}, {'data', 'map'}, {'int3', 'unsigned'}, {'int4', 'unsigned'}}
+s:format(format)
+s:create_index('test3', {parts = {{2, 'number'}, {']sad.FIO["fname"]', 'str'}}})
+s:create_index('test3', {parts = {{2, 'number'}, {'[666].FIO["fname"]', 'str'}}})
+s:create_index('test3', {parts = {{2, 'number'}, {'invalid.FIO["fname"]', 'str'}}})
+idx3 = s:create_index('test3', {parts = {{2, 'number'}, {'data.FIO["fname"]', 'str'}}})
+assert(idx3 ~= nil)
+assert(idx3.parts[2].path == "[3][\"FIO\"][\"fname\"]")
+-- Vinyl has optimizations that omit index checks, so errors could differ.
+idx3:drop()
 s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
 s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
 s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3.1 1/5] box: refactor API to use non-constant key_def
  2018-09-20  7:46 ` [PATCH v3.1 1/5] box: refactor API to use non-constant key_def Kirill Shcherbatov
@ 2018-09-21  9:30   ` Vladimir Davydov
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2018-09-21  9:30 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

Pushed to 1.10

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3.1 2/5] box: introduce tuple_field_by_part routine
  2018-09-20  7:46 ` [PATCH v3.1 2/5] box: introduce tuple_field_by_part routine Kirill Shcherbatov
@ 2018-09-21  9:33   ` Vladimir Davydov
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2018-09-21  9:33 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

On Thu, Sep 20, 2018 at 10:46:50AM +0300, Kirill Shcherbatov wrote:
> +const char *
> +tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
> +			const uint32_t *field_map, struct key_part *part)
> +{
> +	return tuple_field_raw(format, data, field_map, part->fieldno);
> +}

This function must be defined as inline for now so as not to introduce
any slowdown to hot paths. I moved it to tuple_format.h and pushed the
patch to 1.10. You'll have to move it back to tuple_format.c in the main
patch of the series, which makes it more than just a wrapper.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3.1 3/5] box: introduce JSON indexes
  2018-09-20  7:46 ` [PATCH v3.1 3/5] box: introduce JSON indexes Kirill Shcherbatov
@ 2018-09-22 19:15   ` Vladimir Davydov
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2018-09-22 19:15 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

Hello,

First, the RFC got lost somewhere between reviews. Please resurrect it.
(Personally, I'm against committing RFCs to the repository, but rules
are rules).

Second, there are a few places in the patch where you simultaneously
move and modify huge chunks of code. Please don't do that, because it
makes the patch very difficult to review. Try to avoid moving chunks of
code, but if you absolutely have to, then do it in a separate patch.
Also, don't ever leave stray hunks left from previous versions (there
are a few, too). I won't bother pointing you to exact places in the
patch - you'll have to review your patch and find them by yourself.

Now, about the patch itself. I've a few major comments, which will
probably force you to rework a big part of your patch. I won't give
you a line-by-line review until those are resolved or you convince me
to leave them be:

 1. For some reason, you require that the first component of the json
    path points to the tuple field, i.e. if a tuple field has index 3,
    then the json path specified by the user and stored in the _index
    system space must start with '[3]' too. Such a redundancy looks
    unnecessary and dubious. I assume that you do that so that you can
    maintain a single hash mapping normalized path to tuple_field in
    tuple_format. If so, that doesn't justify this decision, as you can
    as well keep a hash map in each root tuple_field. Please make the
    path point to data within a field, excluding the field itself.

 2. AFAIU (I may be wrong, but it looks like it to me) the hash mapping
    a normalized path to a tuple field (the one that is in tuple_format)
    is a pure optimization. Hence it makes sense to submit it in a
    separate patch, applied on top of the main one. Since the sole
    purpose of path normalization is this optimization, it should be
    factored out and submitted separately, too.

 3. Allocation of json paths after key_def and tuple_format structures
    looks completely pointless to me. Apart from complicating the code,
    this micro optimization achieves nothing. It looks especially
    ridiculous in case of tuple_format, which has to maintain a hash
    table, which implies making multiple allocations anyway. Please
    remove it and allocate the paths in a normal way, with malloc().

    BTW, there's another place where you micro manage memory - it's json
    path normalization procedure, where you try to reuse a buffer left
    from the previous loop iteration. Again, don't do that, it isn't
    worth the complexity it brings to the code.

 4. All json related methods should be defined in src/lib/json as
    independent entities. This concerns json path validation and
    normalization, as well as json tree construction and traversal. And
    each of them should be submitted in a separate patch accompanied by
    a unit test.

    Designing the json tree class and the corresponding traversal method
    will probably require some mental effort, but it'll be worth it,
    because it'll let you remove some duplicate code, as well as split
    tuple format validation and field map construction, which are
    currently done by the same function depending on the arguments (this
    looks bad). Not mentioning that it will also make the code much
    easier to follow and review.

Those are the major points, but there are also a few minor ones
concerning the coding technique that I couldn't help noticing:

 - Very poor comments - it looks like you write them just to follow the
   code style guideline. There are a lot of subtleties in this patch
   (to name a few, what the canonical form of a json path is and why it
   is needed, what the tuple_field tree looks like and why it exists,
   how json paths are allocated and where they are stored), but you
   never elaborate in the comments - you only briefly mention what a
   function does or what a struct field stores, but never explain why or
   what's the catch. I understand that writing comments in English is
   difficult (it is for me too! Sometimes I spend more time on writing
   a comment than on the code), but it is a lame excuse. Please take
   your time, use a dictionary, consult with your colleagues to write
   good comments that are easy to read and that shed some light on
   implementation details so that one can understand what's going on
   without diving deep in the implementation.

 - Weird function names - what's tuple_field_bypass_and_init supposed to
   mean, for example? If you looked up "bypass" you'd see that it means
   "go past/around" or "avoid/evade". This is far from what this
   function is about, which is traversing the tuple field tree. Probably
   this is related to my previous point... Please be more scrupulous
   when it comes to function and variable naming, otherwise your code
   will sound hilarious.

 - Rather strange way to divide the code into functions. For instance,
   take tuple_field_dig_with_parser. Why would you beget such a monster
   and let it out in the open? Outside tuple_format.c you only need two
   ways of looking up data by path - absolute (i.e. including the field
   number) for Lua and relative (i.e. without the field number) for
   extracting a key. So introduce two functions with appropriate names,
   say tuple_field_by_relative_path and tuple_field_by_absolute_path
   (think about the names, please). If we ever need the full power of
   tuple_field_dig_with_parser (the name sucks BTW) somewhere else, we
   will refactor the code again.

   Another example is tuple_field_by_part. I'd expect it to simply call
   tuple_field or tuple_field_by_path, depending on whether the part has
   a json path attached to it or not. However, this function also does
   an optimistic lookup in tuple_format->path_hash. Why can't we do this
   lookup in tuple_field_by_path? Don't we want to optimize Lua calls to
   tuple_field_by_path too? Oh wait, it does a lookup in the hash...
   What's going on there?

Those are the minor points. Again, I won't be pointing to exact places
in the patch, because that would be too tedious. I expect you to go over
your patch and fix them on your own. I'll correct you in the next review
round if you miss anything.

And please don't forget to update the comment to the patch and probably
the RFC when you fix the major points (provided we agree on them, of
course).

Thanks,
Vladimir

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-09-22 19:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-20  7:46 [PATCH v3.1 0/5] box: indexes by JSON path Kirill Shcherbatov
2018-09-20  7:46 ` [PATCH v3.1 1/5] box: refactor API to use non-constant key_def Kirill Shcherbatov
2018-09-21  9:30   ` Vladimir Davydov
2018-09-20  7:46 ` [PATCH v3.1 2/5] box: introduce tuple_field_by_part routine Kirill Shcherbatov
2018-09-21  9:33   ` Vladimir Davydov
2018-09-20  7:46 ` [PATCH v3.1 3/5] box: introduce JSON indexes Kirill Shcherbatov
2018-09-22 19:15   ` Vladimir Davydov
2018-09-20  7:46 ` [PATCH v3.1 4/5] box: introduce offset slot cache in key_part Kirill Shcherbatov
2018-09-20  7:46 ` [PATCH v3.1 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox