* [PATCH v9 3/6] box: introduce JSON Indexes
2019-02-03 10:20 [PATCH v9 0/6] box: Indexes by JSON path Kirill Shcherbatov
2019-02-03 10:20 ` [PATCH v9 1/6] lib: update msgpuck library Kirill Shcherbatov
2019-02-03 10:20 ` [PATCH v9 2/6] box: introduce tuple_field_raw_by_path routine Kirill Shcherbatov
@ 2019-02-03 10:20 ` Kirill Shcherbatov
2019-02-04 12:26 ` Vladimir Davydov
2019-02-03 10:20 ` [PATCH v9 4/6] box: introduce has_json_paths flag in templates Kirill Shcherbatov
` (2 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Kirill Shcherbatov @ 2019-02-03 10:20 UTC (permalink / raw)
To: tarantool-patches, vdavydov.dev; +Cc: Kirill Shcherbatov
New JSON indexes allow indexing document content.
At first, introduced new key_part fields path and path_len
representing JSON path string specified by user. Modified
tuple_format_use_key_part routine constructs corresponding
tuple_fields chain in tuple_format::fields tree to indexed data.
The resulting tree is used for type checking and for allocating
indexed fields offset slots.
Then the refined tuple_init_field_map routine logic parses tuple
msgpack in depth using a stack allocated on the region and initializes
the field map with the corresponding tuple_format::field, if any.
Finally, to perform memory allocation for vinyl's secondary keys
restored from extracted keys loaded from disk without a fields
tree traversal, introduced the format::min_tuple_size field - the
size of tuple_format tuple as if all leaf fields are zero.
Example:
To create a new JSON index specify path to document data as a
part of key_part:
parts = {{3, 'str', path = '.FIO.fname', is_nullable = false}}
idx = s:create_index('json_idx', {parts = parts})
idx:select("Ivanov")
Part of #1012
---
src/box/alter.cc | 2 +-
src/box/index_def.c | 10 +-
src/box/key_def.c | 157 ++++++++--
src/box/key_def.h | 34 +-
src/box/lua/space.cc | 5 +
src/box/memtx_engine.c | 4 +
src/box/sql.c | 1 +
src/box/sql/build.c | 1 +
src/box/sql/select.c | 3 +-
src/box/sql/where.c | 1 +
src/box/tuple_compare.cc | 7 +-
src/box/tuple_extract_key.cc | 35 ++-
src/box/tuple_format.c | 374 ++++++++++++++++++----
src/box/tuple_format.h | 60 +++-
src/box/tuple_hash.cc | 2 +-
src/box/vinyl.c | 4 +
src/box/vy_log.c | 61 +++-
src/box/vy_point_lookup.c | 4 +-
src/box/vy_stmt.c | 202 +++++++++---
test/engine/json.result | 591 +++++++++++++++++++++++++++++++++++
test/engine/json.test.lua | 167 ++++++++++
21 files changed, 1546 insertions(+), 179 deletions(-)
create mode 100644 test/engine/json.result
create mode 100644 test/engine/json.test.lua
diff --git a/src/box/alter.cc b/src/box/alter.cc
index 0589c9678..9656a4189 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -268,7 +268,7 @@ index_def_new_from_tuple(struct tuple *tuple, struct space *space)
});
if (key_def_decode_parts(part_def, part_count, &parts,
space->def->fields,
- space->def->field_count) != 0)
+ space->def->field_count, &fiber()->gc) != 0)
diag_raise();
key_def = key_def_new(part_def, part_count);
if (key_def == NULL)
diff --git a/src/box/index_def.c b/src/box/index_def.c
index 2ba57ee9d..c52aa38d1 100644
--- a/src/box/index_def.c
+++ b/src/box/index_def.c
@@ -31,6 +31,8 @@
#include "index_def.h"
#include "schema_def.h"
#include "identifier.h"
+#include "tuple_format.h"
+#include "json/json.h"
const char *index_type_strs[] = { "HASH", "TREE", "BITSET", "RTREE" };
@@ -278,8 +280,12 @@ index_def_is_valid(struct index_def *index_def, const char *space_name)
* Courtesy to a user who could have made
* a typo.
*/
- if (index_def->key_def->parts[i].fieldno ==
- index_def->key_def->parts[j].fieldno) {
+ struct key_part *part_a = &index_def->key_def->parts[i];
+ struct key_part *part_b = &index_def->key_def->parts[j];
+ if (part_a->fieldno == part_b->fieldno &&
+ json_path_cmp(part_a->path, part_a->path_len,
+ part_b->path, part_b->path_len,
+ TUPLE_INDEX_BASE) == 0) {
diag_set(ClientError, ER_MODIFY_INDEX,
index_def->name, space_name,
"same key part is indexed twice");
diff --git a/src/box/key_def.c b/src/box/key_def.c
index dae3580e2..1b00945ca 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -28,6 +28,7 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
+#include "json/json.h"
#include "key_def.h"
#include "tuple_compare.h"
#include "tuple_extract_key.h"
@@ -35,6 +36,7 @@
#include "column_mask.h"
#include "schema_def.h"
#include "coll_id_cache.h"
+#include "small/region.h"
const char *sort_order_strs[] = { "asc", "desc", "undef" };
@@ -44,7 +46,8 @@ const struct key_part_def key_part_def_default = {
COLL_NONE,
false,
ON_CONFLICT_ACTION_DEFAULT,
- SORT_ORDER_ASC
+ SORT_ORDER_ASC,
+ NULL
};
static int64_t
@@ -59,6 +62,7 @@ part_type_by_name_wrapper(const char *str, uint32_t len)
#define PART_OPT_NULLABILITY "is_nullable"
#define PART_OPT_NULLABLE_ACTION "nullable_action"
#define PART_OPT_SORT_ORDER "sort_order"
+#define PART_OPT_PATH "path"
const struct opt_def part_def_reg[] = {
OPT_DEF_ENUM(PART_OPT_TYPE, field_type, struct key_part_def, type,
@@ -71,19 +75,33 @@ const struct opt_def part_def_reg[] = {
struct key_part_def, nullable_action, NULL),
OPT_DEF_ENUM(PART_OPT_SORT_ORDER, sort_order, struct key_part_def,
sort_order, NULL),
+ OPT_DEF(PART_OPT_PATH, OPT_STRPTR, struct key_part_def, path),
OPT_END,
};
struct key_def *
key_def_dup(const struct key_def *src)
{
- size_t sz = key_def_sizeof(src->part_count);
+ size_t sz = 0;
+ for (uint32_t i = 0; i < src->part_count; i++)
+ sz += src->parts[i].path_len;
+ sz = key_def_sizeof(src->part_count, sz);
struct key_def *res = (struct key_def *)malloc(sz);
if (res == NULL) {
diag_set(OutOfMemory, sz, "malloc", "res");
return NULL;
}
memcpy(res, src, sz);
+ /*
+ * Update the paths pointers so that they refer to the
+ * JSON strings bytes in the new allocation.
+ */
+ for (uint32_t i = 0; i < src->part_count; i++) {
+ if (src->parts[i].path == NULL)
+ continue;
+ size_t path_offset = src->parts[i].path - (char *)src;
+ res->parts[i].path = (char *)res + path_offset;
+ }
return res;
}
@@ -91,8 +109,16 @@ void
key_def_swap(struct key_def *old_def, struct key_def *new_def)
{
assert(old_def->part_count == new_def->part_count);
- for (uint32_t i = 0; i < new_def->part_count; i++)
+ for (uint32_t i = 0; i < new_def->part_count; i++) {
SWAP(old_def->parts[i], new_def->parts[i]);
+ /*
+ * Paths are allocated as a part of key_def so
+ * we need to swap path pointers back - it's OK
+ * as paths aren't supposed to change.
+ */
+ assert(old_def->parts[i].path_len == new_def->parts[i].path_len);
+ SWAP(old_def->parts[i].path, new_def->parts[i].path);
+ }
SWAP(*old_def, *new_def);
}
@@ -115,24 +141,39 @@ static void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
enum field_type type, enum on_conflict_action nullable_action,
struct coll *coll, uint32_t coll_id,
- enum sort_order sort_order)
+ enum sort_order sort_order, const char *path,
+ uint32_t path_len, char **path_pool)
{
assert(part_no < def->part_count);
assert(type < field_type_MAX);
def->is_nullable |= (nullable_action == ON_CONFLICT_ACTION_NONE);
+ def->has_json_paths |= path != NULL;
def->parts[part_no].nullable_action = nullable_action;
def->parts[part_no].fieldno = fieldno;
def->parts[part_no].type = type;
def->parts[part_no].coll = coll;
def->parts[part_no].coll_id = coll_id;
def->parts[part_no].sort_order = sort_order;
+ if (path != NULL) {
+ assert(path_pool != NULL);
+ def->parts[part_no].path = *path_pool;
+ *path_pool += path_len;
+ memcpy(def->parts[part_no].path, path, path_len);
+ def->parts[part_no].path_len = path_len;
+ } else {
+ def->parts[part_no].path = NULL;
+ def->parts[part_no].path_len = 0;
+ }
column_mask_set_fieldno(&def->column_mask, fieldno);
}
struct key_def *
key_def_new(const struct key_part_def *parts, uint32_t part_count)
{
- size_t sz = key_def_sizeof(part_count);
+ size_t sz = 0;
+ for (uint32_t i = 0; i < part_count; i++)
+ sz += parts[i].path != NULL ? strlen(parts[i].path) : 0;
+ sz = key_def_sizeof(part_count, sz);
struct key_def *def = calloc(1, sz);
if (def == NULL) {
diag_set(OutOfMemory, sz, "malloc", "struct key_def");
@@ -142,6 +183,8 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count)
def->part_count = part_count;
def->unique_part_count = part_count;
+ /* A pointer to the JSON paths data in the new key_def. */
+ char *path_pool = (char *)def + key_def_sizeof(part_count, 0);
for (uint32_t i = 0; i < part_count; i++) {
const struct key_part_def *part = &parts[i];
struct coll *coll = NULL;
@@ -155,16 +198,20 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count)
}
coll = coll_id->coll;
}
+ uint32_t path_len = part->path != NULL ? strlen(part->path) : 0;
key_def_set_part(def, i, part->fieldno, part->type,
part->nullable_action, coll, part->coll_id,
- part->sort_order);
+ part->sort_order, part->path, path_len,
+ &path_pool);
}
+ assert(path_pool == (char *)def + sz);
key_def_set_cmp(def);
return def;
}
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
+int
+key_def_dump_parts(const struct key_def *def, struct key_part_def *parts,
+ struct region *region)
{
for (uint32_t i = 0; i < def->part_count; i++) {
const struct key_part *part = &def->parts[i];
@@ -174,13 +221,27 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
part_def->is_nullable = key_part_is_nullable(part);
part_def->nullable_action = part->nullable_action;
part_def->coll_id = part->coll_id;
+ if (part->path != NULL) {
+ char *path = region_alloc(region, part->path_len + 1);
+ if (path == NULL) {
+ diag_set(OutOfMemory, part->path_len + 1,
+ "region", "part_def->path");
+ return -1;
+ }
+ memcpy(path, part->path, part->path_len);
+ path[part->path_len] = '\0';
+ part_def->path = path;
+ } else {
+ part_def->path = NULL;
+ }
}
+ return 0;
}
box_key_def_t *
box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
{
- size_t sz = key_def_sizeof(part_count);
+ size_t sz = key_def_sizeof(part_count, 0);
struct key_def *key_def = calloc(1, sz);
if (key_def == NULL) {
diag_set(OutOfMemory, sz, "malloc", "struct key_def");
@@ -194,7 +255,8 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
key_def_set_part(key_def, item, fields[item],
(enum field_type)types[item],
ON_CONFLICT_ACTION_DEFAULT,
- NULL, COLL_NONE, SORT_ORDER_ASC);
+ NULL, COLL_NONE, SORT_ORDER_ASC, NULL, 0,
+ NULL);
}
key_def_set_cmp(key_def);
return key_def;
@@ -243,6 +305,11 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
if (key_part_is_nullable(part1) != key_part_is_nullable(part2))
return key_part_is_nullable(part1) <
key_part_is_nullable(part2) ? -1 : 1;
+ int rc = json_path_cmp(part1->path, part1->path_len,
+ part2->path, part2->path_len,
+ TUPLE_INDEX_BASE);
+ if (rc != 0)
+ return rc;
}
return part_count1 < part_count2 ? -1 : part_count1 > part_count2;
}
@@ -253,8 +320,9 @@ key_def_update_optionality(struct key_def *def, uint32_t min_field_count)
def->has_optional_parts = false;
for (uint32_t i = 0; i < def->part_count; ++i) {
struct key_part *part = &def->parts[i];
- def->has_optional_parts |= key_part_is_nullable(part) &&
- min_field_count < part->fieldno + 1;
+ def->has_optional_parts |=
+ (min_field_count < part->fieldno + 1 ||
+ part->path != NULL) && key_part_is_nullable(part);
/*
* One optional part is enough to switch to new
* comparators.
@@ -274,8 +342,13 @@ key_def_snprint_parts(char *buf, int size, const struct key_part_def *parts,
for (uint32_t i = 0; i < part_count; i++) {
const struct key_part_def *part = &parts[i];
assert(part->type < field_type_MAX);
- SNPRINT(total, snprintf, buf, size, "%d, '%s'",
+ SNPRINT(total, snprintf, buf, size, "[%d, '%s'",
(int)part->fieldno, field_type_strs[part->type]);
+ if (part->path != NULL) {
+ SNPRINT(total, snprintf, buf, size, ", path='%s'",
+ part->path);
+ }
+ SNPRINT(total, snprintf, buf, size, "]");
if (i < part_count - 1)
SNPRINT(total, snprintf, buf, size, ", ");
}
@@ -294,6 +367,8 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
count++;
if (part->is_nullable)
count++;
+ if (part->path != NULL)
+ count++;
size += mp_sizeof_map(count);
size += mp_sizeof_str(strlen(PART_OPT_FIELD));
size += mp_sizeof_uint(part->fieldno);
@@ -308,6 +383,10 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
size += mp_sizeof_str(strlen(PART_OPT_NULLABILITY));
size += mp_sizeof_bool(part->is_nullable);
}
+ if (part->path != NULL) {
+ size += mp_sizeof_str(strlen(PART_OPT_PATH));
+ size += mp_sizeof_str(strlen(part->path));
+ }
}
return size;
}
@@ -323,6 +402,8 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
count++;
if (part->is_nullable)
count++;
+ if (part->path != NULL)
+ count++;
data = mp_encode_map(data, count);
data = mp_encode_str(data, PART_OPT_FIELD,
strlen(PART_OPT_FIELD));
@@ -342,6 +423,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
strlen(PART_OPT_NULLABILITY));
data = mp_encode_bool(data, part->is_nullable);
}
+ if (part->path != NULL) {
+ data = mp_encode_str(data, PART_OPT_PATH,
+ strlen(PART_OPT_PATH));
+ data = mp_encode_str(data, part->path,
+ strlen(part->path));
+ }
}
return data;
}
@@ -403,6 +490,7 @@ key_def_decode_parts_166(struct key_part_def *parts, uint32_t part_count,
fields[part->fieldno].is_nullable :
key_part_def_default.is_nullable);
part->coll_id = COLL_NONE;
+ part->path = NULL;
}
return 0;
}
@@ -410,7 +498,7 @@ key_def_decode_parts_166(struct key_part_def *parts, uint32_t part_count,
int
key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
const char **data, const struct field_def *fields,
- uint32_t field_count)
+ uint32_t field_count, struct region *region)
{
if (mp_typeof(**data) == MP_ARRAY) {
return key_def_decode_parts_166(parts, part_count, data,
@@ -439,7 +527,7 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
const char *key = mp_decode_str(data, &key_len);
if (opts_parse_key(part, part_def_reg, key, key_len, data,
ER_WRONG_INDEX_OPTIONS,
- i + TUPLE_INDEX_BASE, NULL,
+ i + TUPLE_INDEX_BASE, region,
false) != 0)
return -1;
if (is_action_missing &&
@@ -485,6 +573,13 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
"index part: unknown sort order");
return -1;
}
+ if (part->path != NULL &&
+ json_path_validate(part->path, strlen(part->path),
+ TUPLE_INDEX_BASE) != 0) {
+ diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+ i + TUPLE_INDEX_BASE, "invalid path");
+ return -1;
+ }
}
return 0;
}
@@ -504,7 +599,10 @@ key_def_find(const struct key_def *key_def, const struct key_part *to_find)
const struct key_part *part = key_def->parts;
const struct key_part *end = part + key_def->part_count;
for (; part != end; part++) {
- if (part->fieldno == to_find->fieldno)
+ if (part->fieldno == to_find->fieldno &&
+ json_path_cmp(part->path, part->path_len,
+ to_find->path, to_find->path_len,
+ TUPLE_INDEX_BASE) == 0)
return part;
}
return NULL;
@@ -530,18 +628,25 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
* Find and remove part duplicates, i.e. parts counted
* twice since they are present in both key defs.
*/
- const struct key_part *part = second->parts;
- const struct key_part *end = part + second->part_count;
+ size_t sz = 0;
+ const struct key_part *part = first->parts;
+ const struct key_part *end = part + first->part_count;
+ for (; part != end; part++)
+ sz += part->path_len;
+ part = second->parts;
+ end = part + second->part_count;
for (; part != end; part++) {
if (key_def_find(first, part) != NULL)
--new_part_count;
+ else
+ sz += part->path_len;
}
+ sz = key_def_sizeof(new_part_count, sz);
struct key_def *new_def;
- new_def = (struct key_def *)calloc(1, key_def_sizeof(new_part_count));
+ new_def = (struct key_def *)calloc(1, sz);
if (new_def == NULL) {
- diag_set(OutOfMemory, key_def_sizeof(new_part_count), "malloc",
- "new_def");
+ diag_set(OutOfMemory, sz, "malloc", "new_def");
return NULL;
}
new_def->part_count = new_part_count;
@@ -549,6 +654,9 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
new_def->is_nullable = first->is_nullable || second->is_nullable;
new_def->has_optional_parts = first->has_optional_parts ||
second->has_optional_parts;
+
+ /* JSON paths data in the new key_def. */
+ char *path_pool = (char *)new_def + key_def_sizeof(new_part_count, 0);
/* Write position in the new key def. */
uint32_t pos = 0;
/* Append first key def's parts to the new index_def. */
@@ -557,7 +665,8 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
for (; part != end; part++) {
key_def_set_part(new_def, pos++, part->fieldno, part->type,
part->nullable_action, part->coll,
- part->coll_id, part->sort_order);
+ part->coll_id, part->sort_order, part->path,
+ part->path_len, &path_pool);
}
/* Set-append second key def's part to the new key def. */
@@ -568,8 +677,10 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
continue;
key_def_set_part(new_def, pos++, part->fieldno, part->type,
part->nullable_action, part->coll,
- part->coll_id, part->sort_order);
+ part->coll_id, part->sort_order, part->path,
+ part->path_len, &path_pool);
}
+ assert(path_pool == (char *)new_def + sz);
key_def_set_cmp(new_def);
return new_def;
}
diff --git a/src/box/key_def.h b/src/box/key_def.h
index d1866303b..678d1f070 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -64,6 +64,12 @@ struct key_part_def {
enum on_conflict_action nullable_action;
/** Part sort order. */
enum sort_order sort_order;
+ /**
+ * JSON path to indexed data, relative to the field number,
+ * or NULL if this key part indexes a top-level field.
+ * This string is 0-terminated.
+ */
+ const char *path;
};
extern const struct key_part_def key_part_def_default;
@@ -82,6 +88,15 @@ struct key_part {
enum on_conflict_action nullable_action;
/** Part sort order. */
enum sort_order sort_order;
+ /**
+ * JSON path to indexed data, relative to the field number,
+ * or NULL if this key part indexes a top-level field.
+ * This string is not 0-terminated. String memory is
+ * allocated at the end of key_def.
+ */
+ char *path;
+ /** The length of JSON path. */
+ uint32_t path_len;
};
struct key_def;
@@ -148,6 +163,8 @@ struct key_def {
uint32_t unique_part_count;
/** True, if at least one part can store NULL. */
bool is_nullable;
+ /** True if some key part has JSON path. */
+ bool has_json_paths;
/**
* True, if some key parts can be absent in a tuple. These
* fields assumed to be MP_NIL.
@@ -241,9 +258,10 @@ box_tuple_compare_with_key(const box_tuple_t *tuple_a, const char *key_b,
/** \endcond public */
static inline size_t
-key_def_sizeof(uint32_t part_count)
+key_def_sizeof(uint32_t part_count, uint32_t path_pool_size)
{
- return sizeof(struct key_def) + sizeof(struct key_part) * part_count;
+ return sizeof(struct key_def) + sizeof(struct key_part) * part_count +
+ path_pool_size;
}
/**
@@ -255,9 +273,12 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count);
/**
* Dump part definitions of the given key def.
+ * The region is used for allocating JSON paths, if any.
+ * Return -1 on memory allocation error, 0 on success.
*/
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
+int
+key_def_dump_parts(const struct key_def *def, struct key_part_def *parts,
+ struct region *region);
/**
* Update 'has_optional_parts' of @a key_def with correspondence
@@ -299,11 +320,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
* [NUM, STR, ..][NUM, STR, ..]..,
* OR
* {field=NUM, type=STR, ..}{field=NUM, type=STR, ..}..,
+ * The region is used for allocating JSON paths, if any.
*/
int
key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
const char **data, const struct field_def *fields,
- uint32_t field_count);
+ uint32_t field_count, struct region *region);
/**
* Returns the part in index_def->parts for the specified fieldno.
@@ -364,6 +386,8 @@ key_validate_parts(const struct key_def *key_def, const char *key,
static inline bool
key_def_is_sequential(const struct key_def *key_def)
{
+ if (key_def->has_json_paths)
+ return false;
for (uint32_t part_id = 0; part_id < key_def->part_count; part_id++) {
if (key_def->parts[part_id].fieldno != part_id)
return false;
diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 7cae436f1..1f152917e 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -296,6 +296,11 @@ lbox_fillspace(struct lua_State *L, struct space *space, int i)
lua_pushnumber(L, part->fieldno + TUPLE_INDEX_BASE);
lua_setfield(L, -2, "fieldno");
+ if (part->path != NULL) {
+ lua_pushlstring(L, part->path, part->path_len);
+ lua_setfield(L, -2, "path");
+ }
+
lua_pushboolean(L, key_part_is_nullable(part));
lua_setfield(L, -2, "is_nullable");
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 692e41efb..64f43456e 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1312,6 +1312,10 @@ memtx_index_def_change_requires_rebuild(struct index *index,
return true;
if (old_part->coll != new_part->coll)
return true;
+ if (json_path_cmp(old_part->path, old_part->path_len,
+ new_part->path, new_part->path_len,
+ TUPLE_INDEX_BASE) != 0)
+ return true;
}
return false;
}
diff --git a/src/box/sql.c b/src/box/sql.c
index 387da7b3d..94fd7e369 100644
--- a/src/box/sql.c
+++ b/src/box/sql.c
@@ -386,6 +386,7 @@ sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
part->nullable_action = ON_CONFLICT_ACTION_NONE;
part->is_nullable = true;
part->sort_order = SORT_ORDER_ASC;
+ part->path = NULL;
if (def != NULL && i < def->part_count)
part->coll_id = def->parts[i].coll_id;
else
diff --git a/src/box/sql/build.c b/src/box/sql/build.c
index 49b90b5d0..947daf8f6 100644
--- a/src/box/sql/build.c
+++ b/src/box/sql/build.c
@@ -2185,6 +2185,7 @@ index_fill_def(struct Parse *parse, struct index *index,
part->is_nullable = part->nullable_action == ON_CONFLICT_ACTION_NONE;
part->sort_order = SORT_ORDER_ASC;
part->coll_id = coll_id;
+ part->path = NULL;
}
key_def = key_def_new(key_parts, expr_list->nExpr);
if (key_def == NULL)
diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index 02ee225f1..3f136a342 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -1360,6 +1360,7 @@ sql_key_info_new(sqlite3 *db, uint32_t part_count)
part->is_nullable = false;
part->nullable_action = ON_CONFLICT_ACTION_ABORT;
part->sort_order = SORT_ORDER_ASC;
+ part->path = NULL;
}
return key_info;
}
@@ -1377,7 +1378,7 @@ sql_key_info_new_from_key_def(sqlite3 *db, const struct key_def *key_def)
key_info->key_def = NULL;
key_info->refs = 1;
key_info->part_count = key_def->part_count;
- key_def_dump_parts(key_def, key_info->parts);
+ key_def_dump_parts(key_def, key_info->parts, NULL);
return key_info;
}
diff --git a/src/box/sql/where.c b/src/box/sql/where.c
index 571b5af78..814bd3926 100644
--- a/src/box/sql/where.c
+++ b/src/box/sql/where.c
@@ -2807,6 +2807,7 @@ whereLoopAddBtree(WhereLoopBuilder * pBuilder, /* WHERE clause information */
part.is_nullable = false;
part.sort_order = SORT_ORDER_ASC;
part.coll_id = COLL_NONE;
+ part.path = NULL;
struct key_def *key_def = key_def_new(&part, 1);
if (key_def == NULL) {
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 3fe4cae32..7ab6e3bf6 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -469,7 +469,8 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
struct key_part *part = key_def->parts;
const char *tuple_a_raw = tuple_data(tuple_a);
const char *tuple_b_raw = tuple_data(tuple_b);
- if (key_def->part_count == 1 && part->fieldno == 0) {
+ if (key_def->part_count == 1 && part->fieldno == 0 &&
+ part->path == NULL) {
/*
* First field can not be optional - empty tuples
* can not exist.
@@ -1027,7 +1028,7 @@ tuple_compare_create(const struct key_def *def)
}
}
assert(! def->has_optional_parts);
- if (!key_def_has_collation(def)) {
+ if (!key_def_has_collation(def) && !def->has_json_paths) {
/* Precalculated comparators don't use collation */
for (uint32_t k = 0;
k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
@@ -1247,7 +1248,7 @@ tuple_compare_with_key_create(const struct key_def *def)
}
}
assert(! def->has_optional_parts);
- if (!key_def_has_collation(def)) {
+ if (!key_def_has_collation(def) && !def->has_json_paths) {
/* Precalculated comparators don't use collation */
for (uint32_t k = 0;
k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index ac8b5a44e..1e8ec7588 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -8,9 +8,10 @@ enum { MSGPACK_NULL = 0xc0 };
static inline bool
key_def_parts_are_sequential(const struct key_def *def, int i)
{
- uint32_t fieldno1 = def->parts[i].fieldno + 1;
- uint32_t fieldno2 = def->parts[i + 1].fieldno;
- return fieldno1 == fieldno2;
+ const struct key_part *part1 = &def->parts[i];
+ const struct key_part *part2 = &def->parts[i + 1];
+ return part1->fieldno + 1 == part2->fieldno &&
+ part1->path == NULL && part2->path == NULL;
}
/** True, if a key con contain two or more parts in sequence. */
@@ -241,7 +242,8 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
if (!key_def_parts_are_sequential(key_def, i))
break;
}
- uint32_t end_fieldno = key_def->parts[i].fieldno;
+ const struct key_part *part = &key_def->parts[i];
+ uint32_t end_fieldno = part->fieldno;
if (fieldno < current_fieldno) {
/* Rewind. */
@@ -283,8 +285,29 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
current_fieldno++;
}
}
- memcpy(key_buf, field, field_end - field);
- key_buf += field_end - field;
+ const char *src = field;
+ const char *src_end = field_end;
+ if (part->path != NULL) {
+ if (tuple_field_go_to_path(&src, part->path,
+ part->path_len) != 0) {
+ /*
+ * The path must be correct as
+ * it has already been validated
+ * in key_def_decode_parts.
+ */
+ unreachable();
+ }
+ assert(src != NULL || has_optional_parts);
+ if (has_optional_parts && src == NULL) {
+ null_count += 1;
+ src = src_end;
+ } else {
+ src_end = src;
+ mp_next(&src_end);
+ }
+ }
+ memcpy(key_buf, src, src_end - src);
+ key_buf += src_end - src;
if (has_optional_parts && null_count != 0) {
memset(key_buf, MSGPACK_NULL, null_count);
key_buf += null_count * mp_sizeof_nil();
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 4d10b0918..d9c408495 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -42,6 +42,33 @@ static intptr_t recycled_format_ids = FORMAT_ID_NIL;
static uint32_t formats_size = 0, formats_capacity = 0;
+/**
+ * Find in format1::fields the field by format2_field's JSON path.
+ * Routine uses fiber region for temporary path allocation and
+ * panics on failure.
+ */
+static struct tuple_field *
+tuple_format1_field_by_format2_field(struct tuple_format *format1,
+ struct tuple_field *format2_field)
+{
+ struct region *region = &fiber()->gc;
+ size_t region_svp = region_used(region);
+ uint32_t path_len = json_tree_snprint_path(NULL, 0,
+ &format2_field->token, TUPLE_INDEX_BASE);
+ char *path = region_alloc(region, path_len + 1);
+ if (path == NULL)
+ panic("Can not allocate memory for path");
+ json_tree_snprint_path(path, path_len + 1, &format2_field->token,
+ TUPLE_INDEX_BASE);
+ struct tuple_field *format1_field =
+ json_tree_lookup_path_entry(&format1->fields,
+ &format1->fields.root, path,
+ path_len, TUPLE_INDEX_BASE,
+ struct tuple_field, token);
+ region_truncate(region, region_svp);
+ return format1_field;
+}
+
static int
tuple_format_cmp(const struct tuple_format *format1,
const struct tuple_format *format2)
@@ -50,12 +77,14 @@ tuple_format_cmp(const struct tuple_format *format1,
struct tuple_format *b = (struct tuple_format *)format2;
if (a->exact_field_count != b->exact_field_count)
return a->exact_field_count - b->exact_field_count;
- if (tuple_format_field_count(a) != tuple_format_field_count(b))
- return tuple_format_field_count(a) - tuple_format_field_count(b);
+ if (a->total_field_count != b->total_field_count)
+ return a->total_field_count - b->total_field_count;
- for (uint32_t i = 0; i < tuple_format_field_count(a); ++i) {
- struct tuple_field *field_a = tuple_format_field(a, i);
- struct tuple_field *field_b = tuple_format_field(b, i);
+ struct tuple_field *field_a;
+ json_tree_foreach_entry_preorder(field_a, &a->fields.root,
+ struct tuple_field, token) {
+ struct tuple_field *field_b =
+ tuple_format1_field_by_format2_field(b, field_a);
if (field_a->type != field_b->type)
return (int)field_a->type - (int)field_b->type;
if (field_a->coll_id != field_b->coll_id)
@@ -82,8 +111,9 @@ tuple_format_hash(struct tuple_format *format)
uint32_t h = 13;
uint32_t carry = 0;
uint32_t size = 0;
- for (uint32_t i = 0; i < tuple_format_field_count(format); ++i) {
- struct tuple_field *f = tuple_format_field(format, i);
+ struct tuple_field *f;
+ json_tree_foreach_entry_preorder(f, &format->fields.root,
+ struct tuple_field, token) {
TUPLE_FIELD_MEMBER_HASH(f, type, h, carry, size)
TUPLE_FIELD_MEMBER_HASH(f, coll_id, h, carry, size)
TUPLE_FIELD_MEMBER_HASH(f, nullable_action, h, carry, size)
@@ -135,9 +165,14 @@ static const char *
tuple_field_path(const struct tuple_field *field)
{
assert(field->token.parent != NULL);
- assert(field->token.parent->parent == NULL);
- assert(field->token.type == JSON_TOKEN_NUM);
- return int2str(field->token.num + TUPLE_INDEX_BASE);
+ if (field->token.parent->parent == NULL) {
+ /* Top-level field, no need to format the path. */
+ return int2str(field->token.num + TUPLE_INDEX_BASE);
+ }
+ char *path = tt_static_buf();
+ json_tree_snprint_path(path, TT_STATIC_BUF_LEN, &field->token,
+ TUPLE_INDEX_BASE);
+ return path;
}
/**
@@ -158,18 +193,109 @@ tuple_format_field_by_id(struct tuple_format *format, uint32_t id)
return NULL;
}
+/**
+ * Given a field number and a path, add the corresponding field
+ * to the tuple format, allocating intermediate fields if
+ * necessary.
+ *
+ * Return a pointer to the leaf field on success, NULL on memory
+ * allocation error or type/nullability mismatch error, diag
+ * message is set.
+ */
+static struct tuple_field *
+tuple_format_add_field(struct tuple_format *format, uint32_t fieldno,
+ const char *path, uint32_t path_len, char **path_pool)
+{
+ struct tuple_field *field = NULL;
+ struct tuple_field *parent = tuple_format_field(format, fieldno);
+ assert(parent != NULL);
+ if (path == NULL)
+ return parent;
+ field = tuple_field_new();
+ if (field == NULL)
+ goto fail;
+
+ /*
+ * Retrieve path_len memory chunk from the path_pool and
+ * copy path data there. This is necessary in order to
+ * ensure that each new format::tuple_field refers to
+ * format memory.
+ */
+ memcpy(*path_pool, path, path_len);
+ path = *path_pool;
+ *path_pool += path_len;
+
+ int rc = 0;
+ uint32_t token_count = 0;
+ struct json_tree *tree = &format->fields;
+ struct json_lexer lexer;
+ json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
+ while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 &&
+ field->token.type != JSON_TOKEN_END) {
+ enum field_type expected_type =
+ field->token.type == JSON_TOKEN_STR ?
+ FIELD_TYPE_MAP : FIELD_TYPE_ARRAY;
+ if (field_type1_contains_type2(parent->type, expected_type)) {
+ parent->type = expected_type;
+ } else {
+ diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
+ tuple_field_path(parent),
+ field_type_strs[parent->type],
+ field_type_strs[expected_type]);
+ goto fail;
+ }
+ struct tuple_field *next =
+ json_tree_lookup_entry(tree, &parent->token,
+ &field->token,
+ struct tuple_field, token);
+ if (next == NULL) {
+ field->id = format->total_field_count++;
+ rc = json_tree_add(tree, &parent->token, &field->token);
+ if (rc != 0) {
+ diag_set(OutOfMemory, sizeof(struct json_token),
+ "json_tree_add", "tree");
+ goto fail;
+ }
+ next = field;
+ field = tuple_field_new();
+ if (field == NULL)
+ goto fail;
+ }
+ parent->is_key_part = true;
+ parent = next;
+ token_count++;
+ }
+ /*
+ * The path has already been verified by the
+ * key_def_decode_parts function.
+ */
+ assert(rc == 0 && field->token.type == JSON_TOKEN_END);
+ assert(parent != NULL);
+ /* Update tree depth information. */
+ format->fields_depth = MAX(format->fields_depth, token_count + 1);
+end:
+ tuple_field_delete(field);
+ return parent;
+fail:
+ parent = NULL;
+ goto end;
+}
+
static int
tuple_format_use_key_part(struct tuple_format *format, uint32_t field_count,
const struct key_part *part, bool is_sequential,
- int *current_slot)
+ int *current_slot, char **path_pool)
{
assert(part->fieldno < tuple_format_field_count(format));
- struct tuple_field *field = tuple_format_field(format, part->fieldno);
+ struct tuple_field *field =
+ tuple_format_add_field(format, part->fieldno, part->path,
+ part->path_len, path_pool);
+ if (field == NULL)
+ return -1;
/*
- * If a field is not present in the space format,
- * inherit nullable action of the first key part
- * referencing it.
- */
+ * If a field is not present in the space format, inherit
+ * nullable action of the first key part referencing it.
+ */
if (part->fieldno >= field_count && !field->is_key_part)
field->nullable_action = part->nullable_action;
/*
@@ -224,7 +350,8 @@ tuple_format_use_key_part(struct tuple_format *format, uint32_t field_count,
* simply accessible, so we don't store an offset for it.
*/
if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL &&
- is_sequential == false && part->fieldno > 0) {
+ is_sequential == false &&
+ (part->fieldno > 0 || part->path != NULL)) {
*current_slot = *current_slot - 1;
field->offset_slot = *current_slot;
}
@@ -269,6 +396,11 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
int current_slot = 0;
+ /*
+ * Set pointer to reserved area in the format chunk
+ * allocated with tuple_format_alloc call.
+ */
+ char *path_pool = (char *)format + sizeof(struct tuple_format);
/* extract field type info */
for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
const struct key_def *key_def = keys[key_no];
@@ -279,7 +411,8 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
for (; part < parts_end; part++) {
if (tuple_format_use_key_part(format, field_count, part,
is_sequential,
- ¤t_slot) != 0)
+ ¤t_slot,
+ &path_pool) != 0)
return -1;
}
}
@@ -302,6 +435,7 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
"malloc", "required field bitmap");
return -1;
}
+ format->min_tuple_size = mp_sizeof_array(format->index_field_count);
struct tuple_field *field;
json_tree_foreach_entry_preorder(field, &format->fields.root,
struct tuple_field, token) {
@@ -313,6 +447,44 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
if (json_token_is_leaf(&field->token) &&
!tuple_field_is_nullable(field))
bit_set(format->required_fields, field->id);
+
+ /*
+		 * Update format::min_tuple_size for this field.
+		 * Skip fields that are not involved in an index.
+ */
+ if (!field->is_key_part)
+ continue;
+ if (field->token.type == JSON_TOKEN_NUM) {
+ /*
+ * Account a gap between omitted array
+ * items.
+ */
+ struct json_token **neighbors =
+ field->token.parent->children;
+ for (int i = field->token.sibling_idx - 1; i >= 0; i--) {
+ if (neighbors[i] != NULL &&
+ json_tree_entry(neighbors[i],
+ struct tuple_field,
+ token)->is_key_part)
+ break;
+ format->min_tuple_size += mp_sizeof_nil();
+ }
+ } else {
+ /* Account a key string for map member. */
+ assert(field->token.type == JSON_TOKEN_STR);
+ format->min_tuple_size +=
+ mp_sizeof_str(field->token.len);
+ }
+ int max_child_idx = field->token.max_child_idx;
+ if (json_token_is_leaf(&field->token)) {
+ format->min_tuple_size += mp_sizeof_nil();
+ } else if (field->type == FIELD_TYPE_ARRAY) {
+ format->min_tuple_size +=
+ mp_sizeof_array(max_child_idx + 1);
+ } else if (field->type == FIELD_TYPE_MAP) {
+ format->min_tuple_size +=
+ mp_sizeof_map(max_child_idx + 1);
+ }
}
format->hash = tuple_format_hash(format);
return 0;
@@ -389,6 +561,8 @@ static struct tuple_format *
tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
uint32_t space_field_count, struct tuple_dictionary *dict)
{
+ /* Size of area to store JSON paths data. */
+ uint32_t path_pool_size = 0;
uint32_t index_field_count = 0;
/* find max max field no */
for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
@@ -398,13 +572,15 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
for (; part < pend; part++) {
index_field_count = MAX(index_field_count,
part->fieldno + 1);
+ path_pool_size += part->path_len;
}
}
uint32_t field_count = MAX(space_field_count, index_field_count);
- struct tuple_format *format = malloc(sizeof(struct tuple_format));
+ uint32_t allocation_size = sizeof(struct tuple_format) + path_pool_size;
+ struct tuple_format *format = malloc(allocation_size);
if (format == NULL) {
- diag_set(OutOfMemory, sizeof(struct tuple_format), "malloc",
+ diag_set(OutOfMemory, allocation_size, "malloc",
"tuple format");
return NULL;
}
@@ -440,6 +616,8 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
}
format->total_field_count = field_count;
format->required_fields = NULL;
+ format->fields_depth = 1;
+ format->min_tuple_size = 0;
format->refs = 0;
format->id = FORMAT_ID_NIL;
format->index_field_count = index_field_count;
@@ -581,15 +759,16 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
{
if (format1->exact_field_count != format2->exact_field_count)
return false;
- uint32_t format1_field_count = tuple_format_field_count(format1);
- uint32_t format2_field_count = tuple_format_field_count(format2);
- for (uint32_t i = 0; i < format1_field_count; ++i) {
- struct tuple_field *field1 = tuple_format_field(format1, i);
+ struct tuple_field *field1;
+ json_tree_foreach_entry_preorder(field1, &format1->fields.root,
+ struct tuple_field, token) {
+ struct tuple_field *field2 =
+ tuple_format1_field_by_format2_field(format2, field1);
/*
* The field has a data type in format1, but has
* no data type in format2.
*/
- if (i >= format2_field_count) {
+ if (field2 == NULL) {
/*
* The field can get a name added
* for it, and this doesn't require a data
@@ -605,7 +784,6 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
else
return false;
}
- struct tuple_field *field2 = tuple_format_field(format2, i);
if (! field_type1_contains_type2(field1->type, field2->type))
return false;
/*
@@ -663,52 +841,122 @@ tuple_init_field_map(struct tuple_format *format, uint32_t *field_map,
*/
if (field_count == 0) {
/* Empty tuple, nothing to do. */
- goto skip;
- }
- /* first field is simply accessible, so we do not store offset to it */
- struct tuple_field *field = tuple_format_field(format, 0);
- if (validate &&
- !field_mp_type_is_compatible(field->type, mp_typeof(*pos),
- tuple_field_is_nullable(field))) {
- diag_set(ClientError, ER_FIELD_TYPE, tuple_field_path(field),
- field_type_strs[field->type]);
- goto error;
+ goto finish;
}
- if (required_fields != NULL)
- bit_clear(required_fields, field->id);
- mp_next(&pos);
- /* other fields...*/
- uint32_t i = 1;
uint32_t defined_field_count = MIN(field_count, validate ?
tuple_format_field_count(format) :
format->index_field_count);
- if (field_count < format->index_field_count) {
+ /*
+ * Nullify field map to be able to detect by 0,
+ * which key fields are absent in tuple_field().
+ */
+ memset((char *)field_map - format->field_map_size, 0,
+ format->field_map_size);
+ /*
+ * Prepare mp stack of the size equal to the maximum depth
+ * of the indexed field in the format::fields tree
+ * (fields_depth) to carry out a simultaneous parsing of
+ * the tuple and tree traversal to process type
+ * validations and field map initialization.
+ */
+ uint32_t frames_sz = format->fields_depth * sizeof(struct mp_frame);
+ struct mp_frame *frames = region_alloc(region, frames_sz);
+ if (frames == NULL) {
+ diag_set(OutOfMemory, frames_sz, "region", "frames");
+ goto error;
+ }
+ struct mp_stack stack;
+ mp_stack_create(&stack, format->fields_depth, frames);
+ mp_stack_push(&stack, MP_ARRAY, defined_field_count);
+ struct tuple_field *field;
+ struct json_token *parent = &format->fields.root;
+ while (true) {
+ int idx;
+ while ((idx = mp_stack_advance(&stack)) == -1) {
+ /*
+ * If the elements of the current frame
+ * are over, pop this frame out of stack
+ * and climb one position in the
+ * format::fields tree to match the
+ * changed JSON path to the data in the
+ * tuple.
+ */
+ mp_stack_pop(&stack);
+ if (mp_stack_is_empty(&stack))
+ goto finish;
+ parent = parent->parent;
+ }
/*
- * Nullify field map to be able to detect by 0,
- * which key fields are absent in tuple_field().
+ * Use the top frame of the stack and the
+ * current data offset to prepare the JSON token
+ * for the subsequent format::fields lookup.
*/
- memset((char *)field_map - format->field_map_size, 0,
- format->field_map_size);
- }
- for (; i < defined_field_count; ++i) {
- field = tuple_format_field(format, i);
- if (validate &&
- !field_mp_type_is_compatible(field->type, mp_typeof(*pos),
- tuple_field_is_nullable(field))) {
- diag_set(ClientError, ER_FIELD_TYPE,
- tuple_field_path(field),
- field_type_strs[field->type]);
- goto error;
+ struct json_token token;
+ switch (mp_stack_type(&stack)) {
+ case MP_ARRAY:
+ token.type = JSON_TOKEN_NUM;
+ token.num = idx;
+ break;
+ case MP_MAP:
+ if (mp_typeof(*pos) != MP_STR) {
+ /*
+				 * JSON paths support only string
+				 * keys for maps. Skip this entry.
+ */
+ mp_next(&pos);
+ mp_next(&pos);
+ continue;
+ }
+ token.type = JSON_TOKEN_STR;
+ token.str = mp_decode_str(&pos, (uint32_t *)&token.len);
+ break;
+ default:
+ unreachable();
}
- if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
- field_map[field->offset_slot] =
- (uint32_t) (pos - tuple);
+ /*
+ * Perform lookup for a field in format::fields,
+ * that represents the field metadata by JSON path
+ * corresponding to the current position in the
+ * tuple.
+ */
+ enum mp_type type = mp_typeof(*pos);
+ assert(parent != NULL);
+ field = json_tree_lookup_entry(&format->fields, parent, &token,
+ struct tuple_field, token);
+ if (field != NULL) {
+ bool is_nullable = tuple_field_is_nullable(field);
+ if (validate &&
+ !field_mp_type_is_compatible(field->type, type,
+ is_nullable) != 0) {
+ diag_set(ClientError, ER_FIELD_TYPE,
+ tuple_field_path(field),
+ field_type_strs[field->type]);
+ goto error;
+ }
+ if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+ field_map[field->offset_slot] = pos - tuple;
+ if (required_fields != NULL)
+ bit_clear(required_fields, field->id);
+ }
+ /*
+ * If the current position of the data in tuple
+ * matches the container type (MP_MAP or MP_ARRAY)
+ * and the format::fields tree has such a record,
+ * prepare a new stack frame because it needs to
+ * be analyzed in the next iterations.
+ */
+ if ((type == MP_ARRAY || type == MP_MAP) &&
+ !mp_stack_is_full(&stack) && field != NULL) {
+ uint32_t size = type == MP_ARRAY ?
+ mp_decode_array(&pos) :
+ mp_decode_map(&pos);
+ mp_stack_push(&stack, type, size);
+ parent = &field->token;
+ } else {
+ mp_next(&pos);
}
- if (required_fields != NULL)
- bit_clear(required_fields, field->id);
- mp_next(&pos);
}
-skip:
+finish:
/*
* Check the required field bitmap for missing fields.
*/
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 60b019194..d4b53195b 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -196,6 +196,15 @@ struct tuple_format {
* Shared names storage used by all formats of a space.
*/
struct tuple_dictionary *dict;
+ /**
+ * The size of a minimal tuple conforming to the format
+ * and filled with nils.
+ */
+ uint32_t min_tuple_size;
+ /**
+ * A maximum depth of format::fields subtree.
+ */
+ uint32_t fields_depth;
/**
* Fields comprising the format, organized in a tree.
* First level nodes correspond to tuple fields.
@@ -218,18 +227,36 @@ tuple_format_field_count(struct tuple_format *format)
}
/**
- * Return meta information of a top-level tuple field given
- * a format and a field index.
+ * Return meta information of a tuple field given a format,
+ * field index and path.
*/
static inline struct tuple_field *
-tuple_format_field(struct tuple_format *format, uint32_t fieldno)
+tuple_format_field_by_path(struct tuple_format *format, uint32_t fieldno,
+ const char *path, uint32_t path_len)
{
assert(fieldno < tuple_format_field_count(format));
struct json_token token;
token.type = JSON_TOKEN_NUM;
token.num = fieldno;
- return json_tree_lookup_entry(&format->fields, &format->fields.root,
- &token, struct tuple_field, token);
+ struct tuple_field *root =
+ json_tree_lookup_entry(&format->fields, &format->fields.root,
+ &token, struct tuple_field, token);
+ assert(root != NULL);
+ if (path == NULL)
+ return root;
+ return json_tree_lookup_path_entry(&format->fields, &root->token,
+ path, path_len, TUPLE_INDEX_BASE,
+ struct tuple_field, token);
+}
+
+/**
+ * Return meta information of a top-level tuple field given
+ * a format and a field index.
+ */
+static inline struct tuple_field *
+tuple_format_field(struct tuple_format *format, uint32_t fieldno)
+{
+ return tuple_format_field_by_path(format, fieldno, NULL, 0);
}
extern struct tuple_format **tuple_formats;
@@ -401,12 +428,16 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
const char *path, uint32_t path_len)
{
if (likely(fieldno < format->index_field_count)) {
- if (fieldno == 0) {
+ if (path == NULL && fieldno == 0) {
mp_decode_array(&tuple);
- goto parse_path;
+ return tuple;
}
- struct tuple_field *field = tuple_format_field(format, fieldno);
- assert(field != NULL);
+ struct tuple_field *field =
+ tuple_format_field_by_path(format, fieldno, path,
+ path_len);
+ assert(field != NULL || path != NULL);
+ if (path != NULL && field == NULL)
+ goto parse;
int32_t offset_slot = field->offset_slot;
if (offset_slot == TUPLE_OFFSET_SLOT_NIL)
goto parse;
@@ -423,11 +454,11 @@ parse:
return NULL;
for (uint32_t k = 0; k < fieldno; k++)
mp_next(&tuple);
+ if (path != NULL &&
+ unlikely(tuple_field_go_to_path(&tuple, path,
+ path_len) != 0))
+ return NULL;
}
-parse_path:
- if (path != NULL &&
- unlikely(tuple_field_go_to_path(&tuple, path, path_len) != 0))
- return NULL;
return tuple;
}
@@ -482,7 +513,8 @@ static inline const char *
tuple_field_by_part_raw(struct tuple_format *format, const char *data,
const uint32_t *field_map, struct key_part *part)
{
- return tuple_field_raw(format, data, field_map, part->fieldno);
+ return tuple_field_raw_by_path(format, data, field_map, part->fieldno,
+ part->path, part->path_len);
}
/**
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index 078cc6fe0..825c3e5b3 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -223,7 +223,7 @@ key_hash_slowpath(const char *key, struct key_def *key_def);
void
tuple_hash_func_set(struct key_def *key_def) {
- if (key_def->is_nullable)
+ if (key_def->is_nullable || key_def->has_json_paths)
goto slowpath;
/*
* Check that key_def defines sequential a key without holes
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 1832a29c7..065a309f5 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -1005,6 +1005,10 @@ vinyl_index_def_change_requires_rebuild(struct index *index,
return true;
if (!field_type1_contains_type2(new_part->type, old_part->type))
return true;
+ if (json_path_cmp(old_part->path, old_part->path_len,
+ new_part->path, new_part->path_len,
+ TUPLE_INDEX_BASE) != 0)
+ return true;
}
return false;
}
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index 11c763cec..f94b60ff2 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -581,8 +581,9 @@ vy_log_record_decode(struct vy_log_record *record,
record->group_id = mp_decode_uint(&pos);
break;
case VY_LOG_KEY_DEF: {
+ struct region *region = &fiber()->gc;
uint32_t part_count = mp_decode_array(&pos);
- struct key_part_def *parts = region_alloc(&fiber()->gc,
+ struct key_part_def *parts = region_alloc(region,
sizeof(*parts) * part_count);
if (parts == NULL) {
diag_set(OutOfMemory,
@@ -591,7 +592,7 @@ vy_log_record_decode(struct vy_log_record *record,
return -1;
}
if (key_def_decode_parts(parts, part_count, &pos,
- NULL, 0) != 0) {
+ NULL, 0, region) != 0) {
diag_log();
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
"Bad record: failed to decode "
@@ -700,7 +701,8 @@ vy_log_record_dup(struct region *pool, const struct vy_log_record *src)
"struct key_part_def");
goto err;
}
- key_def_dump_parts(src->key_def, dst->key_parts);
+ if (key_def_dump_parts(src->key_def, dst->key_parts, pool) != 0)
+ goto err;
dst->key_part_count = src->key_def->part_count;
dst->key_def = NULL;
}
@@ -1267,6 +1269,46 @@ vy_recovery_lookup_slice(struct vy_recovery *recovery, int64_t slice_id)
return mh_i64ptr_node(h, k)->val;
}
+/**
+ * Allocate a duplicate of an array of key_part_count
+ * key_part_def objects. This function is required because the
+ * original key parts passed as an argument can have non-NULL
+ * path fields referencing other memory fragments.
+ *
+ * Returns the new key_part_def array on success, NULL on error.
+ */
+struct key_part_def *
+vy_recovery_alloc_key_parts(const struct key_part_def *key_parts,
+			    uint32_t key_part_count)
+{
+	/* Total size: the part array plus all path strings (with NULs). */
+	uint32_t new_parts_sz = sizeof(*key_parts) * key_part_count;
+	for (uint32_t i = 0; i < key_part_count; i++) {
+		new_parts_sz += key_parts[i].path != NULL ?
+				strlen(key_parts[i].path) + 1 : 0;
+	}
+	struct key_part_def *new_parts = malloc(new_parts_sz);
+	if (new_parts == NULL) {
+		diag_set(OutOfMemory, new_parts_sz,
+			 "malloc", "struct key_part_def");
+		return NULL;
+	}
+	memcpy(new_parts, key_parts, sizeof(*key_parts) * key_part_count);
+	/* Path strings live in the tail of the same allocation. */
+	char *path_pool =
+		(char *)new_parts + sizeof(*key_parts) * key_part_count;
+	for (uint32_t i = 0; i < key_part_count; i++) {
+		if (key_parts[i].path == NULL)
+			continue;
+		char *path = path_pool;
+		uint32_t path_len = strlen(key_parts[i].path);
+		path_pool += path_len + 1;
+		memcpy(path, key_parts[i].path, path_len);
+		path[path_len] = '\0';
+		new_parts[i].path = path;
+	}
+	assert(path_pool == (char *)new_parts + new_parts_sz);
+	return new_parts;
+}
+
/**
* Allocate a new LSM tree with the given ID and add it to
* the recovery context.
@@ -1292,10 +1334,8 @@ vy_recovery_do_create_lsm(struct vy_recovery *recovery, int64_t id,
"malloc", "struct vy_lsm_recovery_info");
return NULL;
}
- lsm->key_parts = malloc(sizeof(*key_parts) * key_part_count);
+ lsm->key_parts = vy_recovery_alloc_key_parts(key_parts, key_part_count);
if (lsm->key_parts == NULL) {
- diag_set(OutOfMemory, sizeof(*key_parts) * key_part_count,
- "malloc", "struct key_part_def");
free(lsm);
return NULL;
}
@@ -1313,7 +1353,6 @@ vy_recovery_do_create_lsm(struct vy_recovery *recovery, int64_t id,
lsm->space_id = space_id;
lsm->index_id = index_id;
lsm->group_id = group_id;
- memcpy(lsm->key_parts, key_parts, sizeof(*key_parts) * key_part_count);
lsm->key_part_count = key_part_count;
lsm->create_lsn = -1;
lsm->modify_lsn = -1;
@@ -1440,13 +1479,9 @@ vy_recovery_modify_lsm(struct vy_recovery *recovery, int64_t id,
return -1;
}
free(lsm->key_parts);
- lsm->key_parts = malloc(sizeof(*key_parts) * key_part_count);
- if (lsm->key_parts == NULL) {
- diag_set(OutOfMemory, sizeof(*key_parts) * key_part_count,
- "malloc", "struct key_part_def");
+ lsm->key_parts = vy_recovery_alloc_key_parts(key_parts, key_part_count);
+ if (lsm->key_parts == NULL)
return -1;
- }
- memcpy(lsm->key_parts, key_parts, sizeof(*key_parts) * key_part_count);
lsm->key_part_count = key_part_count;
lsm->modify_lsn = modify_lsn;
return 0;
diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
index ddbc2d46f..088177a4b 100644
--- a/src/box/vy_point_lookup.c
+++ b/src/box/vy_point_lookup.c
@@ -196,7 +196,9 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx,
const struct vy_read_view **rv,
struct tuple *key, struct tuple **ret)
{
- assert(tuple_field_count(key) >= lsm->cmp_def->part_count);
+ /* All key parts must be set for a point lookup. */
+ assert(vy_stmt_type(key) != IPROTO_SELECT ||
+ tuple_field_count(key) >= lsm->cmp_def->part_count);
*ret = NULL;
double start_time = ev_monotonic_now(loop());
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 47f135c65..af5c64086 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -383,51 +383,103 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
/* UPSERT can't be surrogate. */
assert(type != IPROTO_UPSERT);
struct region *region = &fiber()->gc;
+ size_t region_svp = region_used(region);
uint32_t field_count = format->index_field_count;
- struct iovec *iov = region_alloc(region, sizeof(*iov) * field_count);
+ uint32_t iov_sz = sizeof(struct iovec) * format->total_field_count;
+ struct iovec *iov = region_alloc(region, iov_sz);
if (iov == NULL) {
- diag_set(OutOfMemory, sizeof(*iov) * field_count,
- "region", "iov for surrogate key");
+ diag_set(OutOfMemory, iov_sz, "region",
+ "iov for surrogate key");
return NULL;
}
- memset(iov, 0, sizeof(*iov) * field_count);
+ memset(iov, 0, iov_sz);
uint32_t part_count = mp_decode_array(&key);
assert(part_count == cmp_def->part_count);
- assert(part_count <= field_count);
- uint32_t nulls_count = field_count - cmp_def->part_count;
- uint32_t bsize = mp_sizeof_array(field_count) +
- mp_sizeof_nil() * nulls_count;
+ assert(part_count <= format->total_field_count);
+	/*
+	 * Calculate bsize using the format::min_tuple_size tuple
+	 * where part_count nils are replaced with extracted keys.
+	 */
+ uint32_t bsize = format->min_tuple_size - mp_sizeof_nil() * part_count;
for (uint32_t i = 0; i < part_count; ++i) {
const struct key_part *part = &cmp_def->parts[i];
assert(part->fieldno < field_count);
+ struct tuple_field *field =
+ tuple_format_field_by_path(format, part->fieldno,
+ part->path, part->path_len);
+ assert(field != NULL);
const char *svp = key;
- iov[part->fieldno].iov_base = (char *) key;
+ iov[field->id].iov_base = (char *) key;
mp_next(&key);
- iov[part->fieldno].iov_len = key - svp;
+ iov[field->id].iov_len = key - svp;
bsize += key - svp;
}
struct tuple *stmt = vy_stmt_alloc(format, bsize);
if (stmt == NULL)
- return NULL;
+ goto out;
char *raw = (char *) tuple_data(stmt);
uint32_t *field_map = (uint32_t *) raw;
+ memset((char *)field_map - format->field_map_size, 0,
+ format->field_map_size);
char *wpos = mp_encode_array(raw, field_count);
- for (uint32_t i = 0; i < field_count; ++i) {
- struct tuple_field *field = tuple_format_field(format, i);
- if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
- field_map[field->offset_slot] = wpos - raw;
- if (iov[i].iov_base == NULL) {
- wpos = mp_encode_nil(wpos);
+ struct tuple_field *field;
+ json_tree_foreach_entry_preorder(field, &format->fields.root,
+ struct tuple_field, token) {
+ /*
+ * Do not restore fields not involved in index
+ * (except gaps in the mp array that may be filled
+ * with nils later).
+ */
+ if (!field->is_key_part)
+ continue;
+ if (field->token.type == JSON_TOKEN_NUM) {
+			/*
+			 * Write nil instead of omitted array
+			 * members.
+ */
+ struct json_token **neighbors =
+ field->token.parent->children;
+ for (int i = field->token.sibling_idx - 1; i >= 0; i--) {
+ if (neighbors[i] != NULL &&
+ json_tree_entry(neighbors[i],
+ struct tuple_field,
+ token)->is_key_part)
+ break;
+ wpos = mp_encode_nil(wpos);
+ }
} else {
- memcpy(wpos, iov[i].iov_base, iov[i].iov_len);
- wpos += iov[i].iov_len;
+ /* Write a key string for map member. */
+ assert(field->token.type == JSON_TOKEN_STR);
+ const char *str = field->token.str;
+ uint32_t len = field->token.len;
+ wpos = mp_encode_str(wpos, str, len);
+ }
+ int max_child_idx = field->token.max_child_idx;
+ if (json_token_is_leaf(&field->token)) {
+ if (iov[field->id].iov_len == 0) {
+ wpos = mp_encode_nil(wpos);
+ } else {
+ memcpy(wpos, iov[field->id].iov_base,
+ iov[field->id].iov_len);
+ uint32_t data_offset = wpos - raw;
+ int32_t slot = field->offset_slot;
+ if (slot != TUPLE_OFFSET_SLOT_NIL)
+ field_map[slot] = data_offset;
+ wpos += iov[field->id].iov_len;
+ }
+ } else if (field->type == FIELD_TYPE_ARRAY) {
+ wpos = mp_encode_array(wpos, max_child_idx + 1);
+ } else if (field->type == FIELD_TYPE_MAP) {
+ wpos = mp_encode_map(wpos, max_child_idx + 1);
}
}
assert(wpos == raw + bsize);
vy_stmt_set_type(stmt, type);
+out:
+ region_truncate(region, region_svp);
return stmt;
}
@@ -443,10 +495,13 @@ struct tuple *
vy_stmt_new_surrogate_delete_raw(struct tuple_format *format,
const char *src_data, const char *src_data_end)
{
+ struct tuple *stmt = NULL;
uint32_t src_size = src_data_end - src_data;
uint32_t total_size = src_size + format->field_map_size;
/* Surrogate tuple uses less memory than the original tuple */
- char *data = region_alloc(&fiber()->gc, total_size);
+ struct region *region = &fiber()->gc;
+ size_t region_svp = region_used(region);
+ char *data = region_alloc(region, total_size);
if (data == NULL) {
diag_set(OutOfMemory, src_size, "region", "tuple");
return NULL;
@@ -456,47 +511,102 @@ vy_stmt_new_surrogate_delete_raw(struct tuple_format *format,
const char *src_pos = src_data;
uint32_t src_count = mp_decode_array(&src_pos);
- uint32_t field_count;
- if (src_count < format->index_field_count) {
- field_count = src_count;
- /*
- * Nullify field map to be able to detect by 0,
- * which key fields are absent in tuple_field().
- */
- memset((char *)field_map - format->field_map_size, 0,
- format->field_map_size);
- } else {
- field_count = format->index_field_count;
- }
+ uint32_t field_count = MIN(src_count, format->index_field_count);
+ /*
+ * Nullify field map to be able to detect by 0, which key
+ * fields are absent in tuple_field().
+ */
+ memset((char *)field_map - format->field_map_size, 0,
+ format->field_map_size);
char *pos = mp_encode_array(data, field_count);
- for (uint32_t i = 0; i < field_count; ++i) {
- struct tuple_field *field = tuple_format_field(format, i);
- if (! field->is_key_part) {
- /* Unindexed field - write NIL. */
- assert(i < src_count);
- pos = mp_encode_nil(pos);
+ /*
+ * Perform simultaneous parsing of the tuple and
+ * format::fields tree traversal to copy indexed field
+	 * data and initialize field map. In many details the code
+	 * below works like tuple_init_field_map; read its
+	 * comments for more details.
+ */
+ uint32_t frames_sz = format->fields_depth * sizeof(struct mp_frame);
+ struct mp_frame *frames = region_alloc(region, frames_sz);
+ if (frames == NULL) {
+ diag_set(OutOfMemory, frames_sz, "region", "frames");
+ goto out;
+ }
+ struct mp_stack stack;
+ mp_stack_create(&stack, format->fields_depth, frames);
+ mp_stack_push(&stack, MP_ARRAY, field_count);
+ struct tuple_field *field;
+ struct json_token *parent = &format->fields.root;
+ while (true) {
+ int idx;
+ while ((idx = mp_stack_advance(&stack)) == -1) {
+ mp_stack_pop(&stack);
+ if (mp_stack_is_empty(&stack))
+ goto finish;
+ parent = parent->parent;
+ }
+ struct json_token token;
+ switch (mp_stack_type(&stack)) {
+ case MP_ARRAY:
+ token.type = JSON_TOKEN_NUM;
+ token.num = idx;
+ break;
+ case MP_MAP:
+ if (mp_typeof(*src_pos) != MP_STR) {
+ mp_next(&src_pos);
+ mp_next(&src_pos);
+ continue;
+ }
+ token.type = JSON_TOKEN_STR;
+ token.str = mp_decode_str(&src_pos, (uint32_t *)&token.len);
+ pos = mp_encode_str(pos, token.str, token.len);
+ break;
+ default:
+ unreachable();
+ }
+ assert(parent != NULL);
+ field = json_tree_lookup_entry(&format->fields, parent, &token,
+ struct tuple_field, token);
+ if (field == NULL || !field->is_key_part) {
mp_next(&src_pos);
+ pos = mp_encode_nil(pos);
continue;
}
- /* Indexed field - copy */
- const char *src_field = src_pos;
- mp_next(&src_pos);
- memcpy(pos, src_field, src_pos - src_field);
if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
field_map[field->offset_slot] = pos - data;
- pos += src_pos - src_field;
+ enum mp_type type = mp_typeof(*src_pos);
+ if ((type == MP_ARRAY || type == MP_MAP) &&
+ !mp_stack_is_full(&stack)) {
+ uint32_t size;
+ if (type == MP_ARRAY) {
+ size = mp_decode_array(&src_pos);
+ pos = mp_encode_array(pos, size);
+ } else {
+ size = mp_decode_map(&src_pos);
+ pos = mp_encode_map(pos, size);
+ }
+ mp_stack_push(&stack, type, size);
+ parent = &field->token;
+ } else {
+ const char *src_field = src_pos;
+ mp_next(&src_pos);
+ memcpy(pos, src_field, src_pos - src_field);
+ pos += src_pos - src_field;
+ }
}
+finish:
assert(pos <= data + src_size);
uint32_t bsize = pos - data;
- struct tuple *stmt = vy_stmt_alloc(format, bsize);
+ stmt = vy_stmt_alloc(format, bsize);
if (stmt == NULL)
- return NULL;
+ goto out;
char *stmt_data = (char *) tuple_data(stmt);
char *stmt_field_map_begin = stmt_data - format->field_map_size;
memcpy(stmt_data, data, bsize);
memcpy(stmt_field_map_begin, field_map_begin, format->field_map_size);
vy_stmt_set_type(stmt, IPROTO_DELETE);
-
+out:
+ region_truncate(region, region_svp);
return stmt;
}
diff --git a/test/engine/json.result b/test/engine/json.result
new file mode 100644
index 000000000..3a5f472bc
--- /dev/null
+++ b/test/engine/json.result
@@ -0,0 +1,591 @@
+test_run = require('test_run').new()
+---
+...
+engine = test_run:get_cfg('engine')
+---
+...
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+-- Test build field tree conflicts.
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO["fname"]'}, {3, 'str', path = '["FIO"].fname'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': same key
+ part is indexed twice'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '["FIO"]["fname"]'}}})
+---
+- error: 'Wrong index options (field 2): ''path'' must be string'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = 'FIO'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+ ''map'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[1]'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+ ''array'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO'}, {3, 'str', path = 'FIO.fname'}}})
+---
+- error: Field [3]["FIO"] has type 'string' in one index, but type 'map' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[1].sname'}, {3, 'str', path = '["FIO"].fname'}}})
+---
+- error: Field 3 has type 'array' in one index, but type 'map' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO....fname'}}})
+---
+- error: 'Wrong index options (field 2): invalid path'
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO.fname', is_nullable = false}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+...
+idx ~= nil
+---
+- true
+...
+idx.parts[2].path == 'FIO.fname'
+---
+- true
+...
+-- Test format mismatch.
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'array'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+---
+...
+s:format(format)
+---
+- error: Field 3 has type 'array' in one index, but type 'map' in another
+...
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'map'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+---
+...
+s:format(format)
+---
+...
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: Field [3]["FIO"]["fname"] has type 'string' in one index, but type 'number'
+ in another
+...
+-- Test incompatable tuple insertion.
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+---
+- error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected
+ map'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+---
+- error: 'Tuple field [3]["FIO"]["fname"] type does not match one required by operation:
+ expected string'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+---
+- error: Tuple field [3]["FIO"]["sname"] required by space format is missing
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+ 4, 5]
+...
+idx:select()
+---
+- - [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+ - [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+ 4, 5]
+...
+idx:min()
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:max()
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+ 4, 5]
+...
+s:drop()
+---
+...
+-- Test upsert of JSON-indexed data.
+s = box.schema.create_space('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[2]'}
+---
+...
+pk = s:create_index('pk', {parts = parts})
+---
+...
+s:insert{{1, 2}, 3}
+---
+- [[1, 2], 3]
+...
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+---
+...
+s:get(2)
+---
+- [[1, 2], 8]
+...
+s:drop()
+---
+...
+-- Test index creation on space with data.
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk = s:create_index('primary', { type = 'tree', parts = {{2, 'number'}} })
+---
+...
+s:insert{1, 1, 7, {town = 'London', FIO = 1234}, 4, 5}
+---
+- [1, 1, 7, {'town': 'London', 'FIO': 1234}, 4, 5]
+...
+s:insert{2, 2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [2, 2, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{3, 3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [3, 3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{4, 4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+---
+- [4, 4, 7, {'town': 'London', 'FIO': [1, 2, 3]}, 4, 5]
+...
+s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}, {4, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: 'Tuple field [4]["FIO"] type does not match one required by operation: expected
+ map'
+...
+_ = s:delete(1)
+---
+...
+s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}, {4, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+_ = s:delete(2)
+---
+...
+s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}, {4, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: 'Tuple field [4]["FIO"] type does not match one required by operation: expected
+ map'
+...
+_ = s:delete(4)
+---
+...
+idx = s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]', is_nullable = true}, {4, 'str', path = '["FIO"]["sname"]'}, {4, 'str', path = '["FIO"]["extra"]', is_nullable = true}}})
+---
+...
+idx ~= nil
+---
+- true
+...
+s:create_index('test2', {parts = {{3, 'number'}, {4, 'number', path = '["FIO"]["fname"]'}}})
+---
+- error: Field [4]["FIO"]["fname"] has type 'string' in one index, but type 'number'
+ in another
+...
+idx2 = s:create_index('test2', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}}})
+---
+...
+idx2 ~= nil
+---
+- true
+...
+t = s:insert{5, 5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+---
+...
+idx:select()
+---
+- - [5, 5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4,
+ 5]
+ - [3, 3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:min()
+---
+- [5, 5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+...
+idx:max()
+---
+- [3, 3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:drop()
+---
+...
+s:drop()
+---
+...
+-- Test complex JSON indexes with nullable fields.
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'str', path='[3][2].a'}
+---
+...
+parts[2] = {1, 'unsigned', path = '[3][1]'}
+---
+...
+parts[3] = {2, 'str', path = '[2].d[1]'}
+---
+...
+pk = s:create_index('primary', { type = 'tree', parts = parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+---
+- [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6,
+ [1, 2, 3]]
+...
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+---
+- error: Duplicate key exists in unique index 'primary' in space 'withdata'
+...
+parts = {}
+---
+...
+parts[1] = {4, 'unsigned', path='[1]', is_nullable = false}
+---
+...
+parts[2] = {4, 'unsigned', path='[2]', is_nullable = true}
+---
+...
+parts[3] = {4, 'unsigned', path='[4]', is_nullable = true}
+---
+...
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+---
+- error: Tuple field [4][1] required by space format is missing
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[3][2].b' }
+---
+...
+parts[2] = {3, 'unsigned'}
+---
+...
+crosspart_idx = s:create_index('crosspart', { parts = parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[3][2].b'}
+---
+...
+num_idx = s:create_index('numeric', {parts = parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:get(2)
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+num_idx:select()
+---
+- - [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+ 9, 2, 3]]
+ - [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}],
+ 6, [1, 2, 3]]
+ - [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+ 0]]
+...
+num_idx:max()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:min()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+crosspart_idx:max() == num_idx:max()
+---
+- true
+...
+crosspart_idx:min() == num_idx:min()
+---
+- true
+...
+trap_idx:max()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+trap_idx:min()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+s:drop()
+---
+...
+-- Test index alter.
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk_simplified = s:create_index('primary', { type = 'tree', parts = {{1, 'unsigned'}}})
+---
+...
+pk_simplified.path == box.NULL
+---
+- true
+...
+idx = s:create_index('idx', {parts = {{2, 'integer', path = 'a'}}})
+---
+...
+s:insert{31, {a = 1, aa = -1}}
+---
+- [31, {'a': 1, 'aa': -1}]
+...
+s:insert{22, {a = 2, aa = -2}}
+---
+- [22, {'a': 2, 'aa': -2}]
+...
+s:insert{13, {a = 3, aa = -3}}
+---
+- [13, {'a': 3, 'aa': -3}]
+...
+idx:select()
+---
+- - [31, {'a': 1, 'aa': -1}]
+ - [22, {'a': 2, 'aa': -2}]
+ - [13, {'a': 3, 'aa': -3}]
+...
+idx:alter({parts = {{2, 'integer', path = 'aa'}}})
+---
+...
+idx:select()
+---
+- - [13, {'a': 3, 'aa': -3}]
+ - [22, {'a': 2, 'aa': -2}]
+ - [31, {'a': 1, 'aa': -1}]
+...
+s:drop()
+---
+...
+-- Incompatible format change.
+s = box.schema.space.create('withdata')
+---
+...
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1]'}}})
+---
+...
+s:insert{{-1}}
+---
+- [[-1]]
+...
+i:alter{parts = {{1, 'string', path = '[1]'}}}
+---
+- error: 'Tuple field [1][1] type does not match one required by operation: expected
+ string'
+...
+s:insert{{'a'}}
+---
+- error: 'Tuple field [1][1] type does not match one required by operation: expected
+ integer'
+...
+i:drop()
+---
+...
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1].FIO'}}})
+---
+...
+s:insert{{{FIO=-1}}}
+---
+- [[{'FIO': -1}]]
+...
+i:alter{parts = {{1, 'integer', path = '[1][1]'}}}
+---
+- error: 'Tuple field [1][1] type does not match one required by operation: expected
+ array'
+...
+i:alter{parts = {{1, 'integer', path = '[1].FIO[1]'}}}
+---
+- error: 'Tuple field [1][1]["FIO"] type does not match one required by operation:
+ expected array'
+...
+s:drop()
+---
+...
+-- Test snapshotting and recovery.
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk = s:create_index('pk', {parts = {{1, 'integer'}, {3, 'string', path = 'town'}}})
+---
+...
+name = s:create_index('name', {parts = {{3, 'string', path = 'FIO.fname'}, {3, 'string', path = 'FIO.sname'}, {3, 'string', path = 'FIO.extra', is_nullable = true}}})
+---
+...
+s:insert{1, 1, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev'}}}
+---
+- [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+...
+s:insert{1, 777, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}}
+---
+- [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+...
+s:insert{1, 45, {town = 'Berlin', FIO = {fname = 'Richard', sname = 'Sorge'}}}
+---
+- [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+s:insert{4, 45, {town = 'Berlin', FIO = {fname = 'Max', extra = 'Otto', sname = 'Stierlitz'}}}
+---
+- [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+...
+pk:select({1})
+---
+- - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+ - [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+ - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+...
+pk:select({1, 'Berlin'})
+---
+- - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+name:select({})
+---
+- - [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+ - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+ - [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+ - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+name:select({'Max'})
+---
+- - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+ - [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+...
+name:get({'Max', 'Stierlitz', 'Otto'})
+---
+- [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+...
+box.snapshot()
+---
+- ok
+...
+test_run:cmd("restart server default")
+s = box.space["withdata"]
+---
+...
+pk = s.index["pk"]
+---
+...
+name = s.index["name"]
+---
+...
+pk:select({1})
+---
+- - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+ - [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+ - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+...
+pk:select({1, 'Berlin'})
+---
+- - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+name:select({})
+---
+- - [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+ - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+ - [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+ - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+name:select({'Max'})
+---
+- - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+ - [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+...
+name:get({'Max', 'Stierlitz', 'Otto'})
+---
+- [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'extra': 'Otto', 'sname': 'Stierlitz'}}]
+...
+s:replace{4, 45, {town = 'Berlin', FIO = {fname = 'Max', sname = 'Stierlitz'}}}
+---
+- [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'sname': 'Stierlitz'}}]
+...
+name:select({'Max', 'Stierlitz'})
+---
+- - [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'sname': 'Stierlitz'}}]
+...
+town = s:create_index('town', {unique = false, parts = {{3, 'string', path = 'town'}}})
+---
+...
+town:select({'Berlin'})
+---
+- - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+ - [4, 45, {'town': 'Berlin', 'FIO': {'fname': 'Max', 'sname': 'Stierlitz'}}]
+...
+_ = s:delete({4, 'Berlin'})
+---
+...
+town:select({'Berlin'})
+---
+- - [1, 45, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+s:update({1, 'Berlin'}, {{"+", 2, 45}})
+---
+- [1, 90, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 90, {town = 'Berlin', FIO = {fname = 'X', sname = 'Y'}}}, {{'+', 2, 1}})
+---
+...
+town:select()
+---
+- - [1, 91, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+ - [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+ - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+...
+name:drop()
+---
+...
+town:select()
+---
+- - [1, 91, {'town': 'Berlin', 'FIO': {'fname': 'Richard', 'sname': 'Sorge'}}]
+ - [1, 777, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}]
+ - [1, 1, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'sname': 'Isaev'}}]
+...
+s:drop()
+---
+...
diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua
new file mode 100644
index 000000000..181eae02c
--- /dev/null
+++ b/test/engine/json.test.lua
@@ -0,0 +1,167 @@
+test_run = require('test_run').new()
+engine = test_run:get_cfg('engine')
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+-- Test build field tree conflicts.
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO["fname"]'}, {3, 'str', path = '["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '["FIO"]["fname"]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = 'FIO'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[1]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO'}, {3, 'str', path = 'FIO.fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[1].sname'}, {3, 'str', path = '["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO....fname'}}})
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO.fname', is_nullable = false}, {3, 'str', path = '["FIO"]["sname"]'}}})
+idx ~= nil
+idx.parts[2].path == 'FIO.fname'
+-- Test format mismatch.
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'array'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+s:format(format)
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'map'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+s:format(format)
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+-- Test incompatable tuple insertion.
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+idx:select()
+idx:min()
+idx:max()
+s:drop()
+
+-- Test upsert of JSON-indexed data.
+s = box.schema.create_space('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'unsigned', path='[2]'}
+pk = s:create_index('pk', {parts = parts})
+s:insert{{1, 2}, 3}
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+s:get(2)
+s:drop()
+
+-- Test index creation on space with data.
+s = box.schema.space.create('withdata', {engine = engine})
+pk = s:create_index('primary', { type = 'tree', parts = {{2, 'number'}} })
+s:insert{1, 1, 7, {town = 'London', FIO = 1234}, 4, 5}
+s:insert{2, 2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{3, 3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{4, 4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}, {4, 'str', path = '["FIO"]["sname"]'}}})
+_ = s:delete(1)
+s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}, {4, 'str', path = '["FIO"]["sname"]'}}})
+_ = s:delete(2)
+s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}, {4, 'str', path = '["FIO"]["sname"]'}}})
+_ = s:delete(4)
+idx = s:create_index('test1', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]', is_nullable = true}, {4, 'str', path = '["FIO"]["sname"]'}, {4, 'str', path = '["FIO"]["extra"]', is_nullable = true}}})
+idx ~= nil
+s:create_index('test2', {parts = {{3, 'number'}, {4, 'number', path = '["FIO"]["fname"]'}}})
+idx2 = s:create_index('test2', {parts = {{3, 'number'}, {4, 'str', path = '["FIO"]["fname"]'}}})
+idx2 ~= nil
+t = s:insert{5, 5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+idx:select()
+idx:min()
+idx:max()
+idx:drop()
+s:drop()
+
+-- Test complex JSON indexes with nullable fields.
+s = box.schema.space.create('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'str', path='[3][2].a'}
+parts[2] = {1, 'unsigned', path = '[3][1]'}
+parts[3] = {2, 'str', path = '[2].d[1]'}
+pk = s:create_index('primary', { type = 'tree', parts = parts})
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+parts = {}
+parts[1] = {4, 'unsigned', path='[1]', is_nullable = false}
+parts[2] = {4, 'unsigned', path='[2]', is_nullable = true}
+parts[3] = {4, 'unsigned', path='[4]', is_nullable = true}
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[3][2].b' }
+parts[2] = {3, 'unsigned'}
+crosspart_idx = s:create_index('crosspart', { parts = parts})
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[3][2].b'}
+num_idx = s:create_index('numeric', {parts = parts})
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+num_idx:get(2)
+num_idx:select()
+num_idx:max()
+num_idx:min()
+crosspart_idx:max() == num_idx:max()
+crosspart_idx:min() == num_idx:min()
+trap_idx:max()
+trap_idx:min()
+s:drop()
+
+-- Test index alter.
+s = box.schema.space.create('withdata', {engine = engine})
+pk_simplified = s:create_index('primary', { type = 'tree', parts = {{1, 'unsigned'}}})
+pk_simplified.path == box.NULL
+idx = s:create_index('idx', {parts = {{2, 'integer', path = 'a'}}})
+s:insert{31, {a = 1, aa = -1}}
+s:insert{22, {a = 2, aa = -2}}
+s:insert{13, {a = 3, aa = -3}}
+idx:select()
+idx:alter({parts = {{2, 'integer', path = 'aa'}}})
+idx:select()
+s:drop()
+
+-- Incompatible format change.
+s = box.schema.space.create('withdata')
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1]'}}})
+s:insert{{-1}}
+i:alter{parts = {{1, 'string', path = '[1]'}}}
+s:insert{{'a'}}
+i:drop()
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1].FIO'}}})
+s:insert{{{FIO=-1}}}
+i:alter{parts = {{1, 'integer', path = '[1][1]'}}}
+i:alter{parts = {{1, 'integer', path = '[1].FIO[1]'}}}
+s:drop()
+
+-- Test snapshotting and recovery.
+s = box.schema.space.create('withdata', {engine = engine})
+pk = s:create_index('pk', {parts = {{1, 'integer'}, {3, 'string', path = 'town'}}})
+name = s:create_index('name', {parts = {{3, 'string', path = 'FIO.fname'}, {3, 'string', path = 'FIO.sname'}, {3, 'string', path = 'FIO.extra', is_nullable = true}}})
+s:insert{1, 1, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev'}}}
+s:insert{1, 777, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}}
+s:insert{1, 45, {town = 'Berlin', FIO = {fname = 'Richard', sname = 'Sorge'}}}
+s:insert{4, 45, {town = 'Berlin', FIO = {fname = 'Max', extra = 'Otto', sname = 'Stierlitz'}}}
+pk:select({1})
+pk:select({1, 'Berlin'})
+name:select({})
+name:select({'Max'})
+name:get({'Max', 'Stierlitz', 'Otto'})
+box.snapshot()
+test_run:cmd("restart server default")
+s = box.space["withdata"]
+pk = s.index["pk"]
+name = s.index["name"]
+pk:select({1})
+pk:select({1, 'Berlin'})
+name:select({})
+name:select({'Max'})
+name:get({'Max', 'Stierlitz', 'Otto'})
+s:replace{4, 45, {town = 'Berlin', FIO = {fname = 'Max', sname = 'Stierlitz'}}}
+name:select({'Max', 'Stierlitz'})
+town = s:create_index('town', {unique = false, parts = {{3, 'string', path = 'town'}}})
+town:select({'Berlin'})
+_ = s:delete({4, 'Berlin'})
+town:select({'Berlin'})
+s:update({1, 'Berlin'}, {{"+", 2, 45}})
+box.snapshot()
+s:upsert({1, 90, {town = 'Berlin', FIO = {fname = 'X', sname = 'Y'}}}, {{'+', 2, 1}})
+town:select()
+name:drop()
+town:select()
+s:drop()
--
2.20.1
^ permalink raw reply [flat|nested] 15+ messages in thread