Tarantool development patches archive
 help / color / mirror / Atom feed
* [PATCH v8 0/5] box: Indexes by JSON path
@ 2019-01-16 13:44 Kirill Shcherbatov
  2019-01-16 13:44 ` [PATCH v8 1/5] lib: updated msgpuck library version Kirill Shcherbatov
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 13:44 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: kostja, Kirill Shcherbatov

Sometimes field data could have complex document structure.
When this structure is consistent across whole document,
you are able to create an index by JSON path.

Changes in version 8:
    - implemented mp_stack class and reworked mp_tuple_field_map_init
      with this new API
    - reworked "specify patch in user-friendly form" as LUA code
    - reworked tuple_field* routines stack to be a little more logical
    - path allocation on region in tuple_format1_can_store_format2_tuples
    - removed max path limit
    - better class members names
    - point code refactoring

v7:
https://www.freelists.org/post/tarantool-patches/PATCH-v7-55-box-specify-indexes-in-userfriendly-form,1
v6:
https://www.freelists.org/post/tarantool-patches/PATCH-v6-88-box-specify-indexes-in-userfriendly-form,1
v5:
https://www.freelists.org/post/tarantool-patches/PATCH-v5-99-box-specify-indexes-in-userfriendly-form,1

http://github.com/tarantool/tarantool/tree/kshch/gh-1012-json-indexes
https://github.com/tarantool/tarantool/issues/1012

Kirill Shcherbatov (5):
  lib: updated msgpuck library version
  box: refactor tuple_field_raw_by_path routine
  box: introduce JSON Indexes
  box: introduce offset_slot cache in key_part
  box: specify indexes in user-friendly form

 src/box/alter.cc             |   2 +-
 src/box/index_def.c          |  10 +-
 src/box/key_def.c            | 153 +++++++++--
 src/box/key_def.h            |  48 +++-
 src/box/lua/schema.lua       |  60 ++++-
 src/box/lua/space.cc         |   5 +
 src/box/lua/tuple.c          |   9 +-
 src/box/memtx_engine.c       |   4 +
 src/box/sql.c                |   1 +
 src/box/sql/build.c          |   1 +
 src/box/sql/select.c         |   3 +-
 src/box/sql/where.c          |   1 +
 src/box/tuple.h              |  51 ++--
 src/box/tuple_compare.cc     |   7 +-
 src/box/tuple_extract_key.cc |  30 ++-
 src/box/tuple_format.c       | 362 +++++++++++++++++--------
 src/box/tuple_format.h       | 128 +++++++--
 src/box/tuple_hash.cc        |   2 +-
 src/box/vinyl.c              |   4 +
 src/box/vy_log.c             |   8 +-
 src/box/vy_point_lookup.c    |   3 +-
 src/box/vy_stmt.c            |  74 ++++--
 src/lib/msgpuck              |   2 +-
 test/engine/json.result      | 494 +++++++++++++++++++++++++++++++++++
 test/engine/json.test.lua    | 143 ++++++++++
 test/engine/tuple.result     |  16 +-
 test/unit/msgpack.result     |  24 +-
 27 files changed, 1401 insertions(+), 244 deletions(-)
 create mode 100644 test/engine/json.result
 create mode 100644 test/engine/json.test.lua

-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 1/5] lib: updated msgpuck library version
  2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
@ 2019-01-16 13:44 ` Kirill Shcherbatov
  2019-01-23 12:53   ` Vladimir Davydov
  2019-01-16 13:44 ` [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine Kirill Shcherbatov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 13:44 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: kostja, Kirill Shcherbatov

Needed for #1012
---
 src/lib/msgpuck          |  2 +-
 test/unit/msgpack.result | 24 +++++++++++++++++++++++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/lib/msgpuck b/src/lib/msgpuck
index 3b8f3e59b..9613f79e6 160000
--- a/src/lib/msgpuck
+++ b/src/lib/msgpuck
@@ -1 +1 @@
-Subproject commit 3b8f3e59b62d74f0198e01cbec0beb9c6a3082fb
+Subproject commit 9613f79e65d9838763422aae0bfb3ca9e2901a32
diff --git a/test/unit/msgpack.result b/test/unit/msgpack.result
index 12e8f8626..39c5b6e01 100644
--- a/test/unit/msgpack.result
+++ b/test/unit/msgpack.result
@@ -1,4 +1,4 @@
-1..20
+1..21
     1..135
     # *** test_uints ***
     # uint 0U
@@ -1824,3 +1824,25 @@ ok 19 - subtests
     ok 4 - mp_check bin overflow
     # *** test_overflow: done ***
 ok 20 - subtests
+    1..18
+    # *** test_mpframe ***
+    ok 1 - Stack is empty
+    ok 2 - Stack is not empty
+    ok 3 - Stack is not full
+    ok 4 - Correct field type
+    ok 5 - Correct items count
+    ok 6 - Not all items on level 1 parsed
+    ok 7 - Correct current item index
+    ok 8 - Correct field type
+    ok 9 - Correct items count
+    ok 10 - Stack is full
+    ok 11 - Not all items on level 2 parsed
+    ok 12 - Not all items on level 2 parsed
+    ok 13 - All items on level 2 parsed
+    ok 14 - Stack is not full
+    ok 15 - Not all items on level 1 parsed
+    ok 16 - Not all items on level 1 parsed
+    ok 17 - All items on level 1 parsed
+    ok 18 - Stack is empty
+    # *** test_mpframe: done ***
+ok 21 - subtests
-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine
  2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
  2019-01-16 13:44 ` [PATCH v8 1/5] lib: updated msgpuck library version Kirill Shcherbatov
@ 2019-01-16 13:44 ` Kirill Shcherbatov
  2019-01-23 13:15   ` Vladimir Davydov
  2019-01-16 13:44 ` [PATCH v8 3/5] box: introduce JSON Indexes Kirill Shcherbatov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 13:44 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: kostja, Kirill Shcherbatov

Refactored tuple_field_raw_by_path routine as function
tuple_field_raw_by_full_path returning field data if the field
exists or NULL, instead of error index in path.
The *_by_full_path is a better name for routine because we are
going to introduce *_by_path routine in future that would work
with relative paths. const char * returning valiue is more
common result type for such routines.
Got rid of invalid path error handling in routine
lbox_tuple_field_by_path.

Needed for #1012
---
 src/box/lua/tuple.c      |  9 ++----
 src/box/tuple.h          | 51 +++++++++++++++------------------
 src/box/tuple_format.c   | 61 +++++++++++-----------------------------
 src/box/tuple_format.h   | 53 ++++++++++++++++++----------------
 test/engine/tuple.result | 16 +++++------
 5 files changed, 79 insertions(+), 111 deletions(-)

diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 1867f810f..7d377b69e 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -463,13 +463,10 @@ lbox_tuple_field_by_path(struct lua_State *L)
 	const char *field = NULL, *path = lua_tolstring(L, 2, &len);
 	if (len == 0)
 		return 0;
-	if (tuple_field_by_path(tuple, path, (uint32_t) len,
-				lua_hashstring(L, 2), &field) != 0) {
-		return luaT_error(L);
-	} else if (field == NULL) {
+	field = tuple_field_by_full_path(tuple, path, (uint32_t)len,
+					 lua_hashstring(L, 2));
+	if (field == NULL)
 		return 0;
-	}
-	assert(field != NULL);
 	luamp_decode(L, luaL_msgpack_default, &field);
 	return 1;
 }
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 83e5b7013..4368dac4e 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -513,6 +513,23 @@ tuple_field(const struct tuple *tuple, uint32_t fieldno)
 			       tuple_field_map(tuple), fieldno);
 }
 
+/**
+ * Get a field refereed by index @part in tuple.
+ * @param tuple Tuple to get the field from.
+ * @param part Index part to use.
+ * @param path Relative JSON path to field data.
+ * @param path_len The length of the @path.
+ * @retval Field data if the field exists or NULL.
+ */
+static inline const char *
+tuple_field_by_path(const struct tuple *tuple, uint32_t fieldno,
+		    const char *path, uint32_t path_len)
+{
+	return tuple_field_raw_by_path(tuple_format(tuple), tuple_data(tuple),
+				       tuple_field_map(tuple), fieldno,
+				       path, path_len);
+}
+
 /**
  * Get a field refereed by index @part in tuple.
  * @param tuple Tuple to get the field from.
@@ -532,38 +549,16 @@ tuple_field_by_part(const struct tuple *tuple, struct key_part *part)
  * @param path Field JSON path.
  * @param path_len Length of @a path.
  * @param path_hash Hash of @a path.
- * @param[out] field Found field, or NULL, if not found.
- *
- * @retval  0 Success.
- * @retval -1 Error in JSON path.
- */
-static inline int
-tuple_field_by_path(const struct tuple *tuple, const char *path,
-                    uint32_t path_len, uint32_t path_hash,
-                    const char **field)
-{
-	return tuple_field_raw_by_path(tuple_format(tuple), tuple_data(tuple),
-	                               tuple_field_map(tuple), path, path_len,
-	                               path_hash, field);
-}
-
-/**
- * Get tuple field by its name.
- * @param tuple Tuple to get field from.
- * @param name Field name.
- * @param name_len Length of @a name.
- * @param name_hash Hash of @a name.
  *
- * @retval not NULL MessagePack field.
- * @retval     NULL No field with @a name.
+ * @retval field data if field exists or NULL
  */
 static inline const char *
-tuple_field_by_name(const struct tuple *tuple, const char *name,
-		    uint32_t name_len, uint32_t name_hash)
+tuple_field_by_full_path(const struct tuple *tuple, const char *path,
+			 uint32_t path_len, uint32_t path_hash)
 {
-	return tuple_field_raw_by_name(tuple_format(tuple), tuple_data(tuple),
-				       tuple_field_map(tuple), name, name_len,
-				       name_hash);
+	return tuple_field_raw_by_full_path(tuple_format(tuple), tuple_data(tuple),
+					    tuple_field_map(tuple),
+					    path, path_len, path_hash);
 }
 
 /**
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index e11b4e6f3..61b4c9734 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -713,15 +713,7 @@ tuple_field_go_to_key(const char **field, const char *key, int len)
 	return -1;
 }
 
-/**
- * Retrieve msgpack data by JSON path.
- * @param data Pointer to msgpack with data.
- * @param path The path to process.
- * @param path_len The length of the @path.
- * @retval 0 On success.
- * @retval >0 On path parsing error, invalid character position.
- */
-static int
+int
 tuple_field_go_to_path(const char **data, const char *path, uint32_t path_len)
 {
 	int rc;
@@ -748,11 +740,10 @@ tuple_field_go_to_path(const char **data, const char *path, uint32_t path_len)
 	return rc;
 }
 
-int
-tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
-                        const uint32_t *field_map, const char *path,
-                        uint32_t path_len, uint32_t path_hash,
-                        const char **field)
+const char *
+tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
+			     const uint32_t *field_map, const char *path,
+			     uint32_t path_len, uint32_t path_hash)
 {
 	assert(path_len > 0);
 	uint32_t fieldno;
@@ -763,22 +754,16 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 	 * use the path as a field name.
 	 */
 	if (tuple_fieldno_by_name(format->dict, path, path_len, path_hash,
-				  &fieldno) == 0) {
-		*field = tuple_field_raw(format, tuple, field_map, fieldno);
-		return 0;
-	}
+				  &fieldno) == 0)
+		return tuple_field_raw(format, tuple, field_map, fieldno);
 	struct json_lexer lexer;
 	struct json_token token;
 	json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
-	int rc = json_lexer_next_token(&lexer, &token);
-	if (rc != 0)
-		goto error;
+	if (json_lexer_next_token(&lexer, &token) != 0)
+		return NULL;
 	switch(token.type) {
 	case JSON_TOKEN_NUM: {
-		int index = token.num;
-		*field = tuple_field_raw(format, tuple, field_map, index);
-		if (*field == NULL)
-			return 0;
+		fieldno = token.num;
 		break;
 	}
 	case JSON_TOKEN_STR: {
@@ -795,28 +780,16 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 			 */
 			name_hash = field_name_hash(token.str, token.len);
 		}
-		*field = tuple_field_raw_by_name(format, tuple, field_map,
-						 token.str, token.len,
-						 name_hash);
-		if (*field == NULL)
-			return 0;
+		if (tuple_fieldno_by_name(format->dict, token.str, token.len,
+					  name_hash, &fieldno) != 0)
+			return NULL;
 		break;
 	}
 	default:
 		assert(token.type == JSON_TOKEN_END);
-		*field = NULL;
-		return 0;
+		return NULL;
 	}
-	rc = tuple_field_go_to_path(field, path + lexer.offset,
-				    path_len - lexer.offset);
-	if (rc == 0)
-		return 0;
-	/* Setup absolute error position. */
-	rc += lexer.offset;
-
-error:
-	assert(rc > 0);
-	diag_set(ClientError, ER_ILLEGAL_PARAMS,
-		 tt_sprintf("error in path on position %d", rc));
-	return -1;
+	return tuple_field_raw_by_path(format, tuple, field_map, fieldno,
+				       path + lexer.offset,
+				       path_len - lexer.offset);
 }
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 30b93b610..d32bb91f2 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -397,27 +397,33 @@ tuple_field_raw(struct tuple_format *format, const char *tuple,
 }
 
 /**
- * Get tuple field by its name.
- * @param format Tuple format.
- * @param tuple MessagePack tuple's body.
- * @param field_map Tuple field map.
- * @param name Field name.
- * @param name_len Length of @a name.
- * @param name_hash Hash of @a name.
- *
- * @retval not NULL MessagePack field.
- * @retval     NULL No field with @a name.
+ * Retrieve msgpack data by JSON path.
+ * @param data Pointer to msgpack with data.
+ * @param path The path to process.
+ * @param path_len The length of the @path.
+ * @retval 0 On success.
+ * @retval >0 On path parsing error, invalid character position.
+ */
+int
+tuple_field_go_to_path(const char **data, const char *path, uint32_t path_len);
+
+/**
+ * Get a field at the specific position in this MessagePack
+ * array by fieldno and path.
  */
 static inline const char *
-tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
-			const uint32_t *field_map, const char *name,
-			uint32_t name_len, uint32_t name_hash)
+tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
+			const uint32_t *field_map, uint32_t fieldno,
+			const char *path, uint32_t path_len)
 {
-	uint32_t fieldno;
-	if (tuple_fieldno_by_name(format->dict, name, name_len, name_hash,
-				  &fieldno) != 0)
+	tuple = tuple_field_raw(format, tuple, field_map, fieldno);
+	if (path == NULL)
+		return tuple;
+	if (unlikely(tuple == NULL))
 		return NULL;
-	return tuple_field_raw(format, tuple, field_map, fieldno);
+	if (unlikely(tuple_field_go_to_path(&tuple, path, path_len) != 0))
+		return NULL;
+	return tuple;
 }
 
 /**
@@ -428,16 +434,13 @@ tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
  * @param path Field path.
  * @param path_len Length of @a path.
  * @param path_hash Hash of @a path.
- * @param[out] field Found field, or NULL, if not found.
  *
- * @retval  0 Success.
- * @retval -1 Error in JSON path.
+ * @retval field data if field exists or NULL
  */
-int
-tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
-                        const uint32_t *field_map, const char *path,
-                        uint32_t path_len, uint32_t path_hash,
-                        const char **field);
+const char *
+tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
+			     const uint32_t *field_map, const char *path,
+			     uint32_t path_len, uint32_t path_hash);
 
 /**
  * Get a tuple field pointed to by an index part.
diff --git a/test/engine/tuple.result b/test/engine/tuple.result
index 7ca3985c7..daf57d1f5 100644
--- a/test/engine/tuple.result
+++ b/test/engine/tuple.result
@@ -823,7 +823,7 @@ t[0]
 ...
 t["[0]"]
 ---
-- error: Illegal parameters, error in path on position 2
+- null
 ...
 t["[1000]"]
 ---
@@ -847,7 +847,7 @@ t["[2][6].key100"]
 ...
 t["[2][0]"] -- 0-based index in array.
 ---
-- error: Illegal parameters, error in path on position 5
+- null
 ...
 t["[4][3]"] -- Can not index string.
 ---
@@ -866,27 +866,27 @@ t["a.b.c d.e.f"]
 -- Sytax errors.
 t["[2].[5]"]
 ---
-- error: Illegal parameters, error in path on position 5
+- null
 ...
 t["[-1]"]
 ---
-- error: Illegal parameters, error in path on position 2
+- null
 ...
 t[".."]
 ---
-- error: Illegal parameters, error in path on position 2
+- null
 ...
 t["[["]
 ---
-- error: Illegal parameters, error in path on position 2
+- null
 ...
 t["]]"]
 ---
-- error: Illegal parameters, error in path on position 1
+- null
 ...
 t["{"]
 ---
-- error: Illegal parameters, error in path on position 1
+- null
 ...
 s:drop()
 ---
-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 3/5] box: introduce JSON Indexes
  2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
  2019-01-16 13:44 ` [PATCH v8 1/5] lib: updated msgpuck library version Kirill Shcherbatov
  2019-01-16 13:44 ` [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine Kirill Shcherbatov
@ 2019-01-16 13:44 ` Kirill Shcherbatov
  2019-01-23 13:49   ` Vladimir Davydov
  2019-01-16 13:44 ` [PATCH v8 4/5] box: introduce offset_slot cache in key_part Kirill Shcherbatov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 13:44 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: kostja, Kirill Shcherbatov

New JSON indexes allows to index documents content.
At first, introduced new key_part fields path and path_len
representing JSON path string specified by user. Modified
tuple_format_use_key_part routine constructs corresponding
tuple_fields chain in tuple_format::fields tree to indexed data.
The resulting tree is used for type checking and for alloctating
indexed fields offset slots.

Then refined tuple_init_field_map routine logic parses tuple
msgpack in depth using stack allocated on region and initialize
field map with corresponding tuple_format::field if any.
Finally, to proceed memory allocation for vinyl's secondary key
restored by extracted keys loaded from disc without fields
tree traversal, introduced format::min_tuple_size field - the
size of tuple_format tuple as if all leaf fields are zero.

Example:
To create a new JSON index specify path to document data as a
part of key_part:
parts = {{3, 'str', path = '.FIO.fname', is_nullable = false}}
idx = s:create_index('json_idx', {parts = parse})
idx:select("Ivanov")

Part of #1012
---
 src/box/alter.cc             |   2 +-
 src/box/index_def.c          |  10 +-
 src/box/key_def.c            | 148 ++++++++++--
 src/box/key_def.h            |  34 ++-
 src/box/lua/space.cc         |   5 +
 src/box/memtx_engine.c       |   4 +
 src/box/sql.c                |   1 +
 src/box/sql/build.c          |   1 +
 src/box/sql/select.c         |   3 +-
 src/box/sql/where.c          |   1 +
 src/box/tuple_compare.cc     |   7 +-
 src/box/tuple_extract_key.cc |  30 ++-
 src/box/tuple_format.c       | 299 ++++++++++++++++++-----
 src/box/tuple_format.h       |  45 +++-
 src/box/tuple_hash.cc        |   2 +-
 src/box/vinyl.c              |   4 +
 src/box/vy_log.c             |   8 +-
 src/box/vy_point_lookup.c    |   3 +-
 src/box/vy_stmt.c            |  74 ++++--
 test/engine/json.result      | 452 +++++++++++++++++++++++++++++++++++
 test/engine/json.test.lua    | 131 ++++++++++
 21 files changed, 1140 insertions(+), 124 deletions(-)
 create mode 100644 test/engine/json.result
 create mode 100644 test/engine/json.test.lua

diff --git a/src/box/alter.cc b/src/box/alter.cc
index 0589c9678..9656a4189 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -268,7 +268,7 @@ index_def_new_from_tuple(struct tuple *tuple, struct space *space)
 	});
 	if (key_def_decode_parts(part_def, part_count, &parts,
 				 space->def->fields,
-				 space->def->field_count) != 0)
+				 space->def->field_count, &fiber()->gc) != 0)
 		diag_raise();
 	key_def = key_def_new(part_def, part_count);
 	if (key_def == NULL)
diff --git a/src/box/index_def.c b/src/box/index_def.c
index 2ba57ee9d..58137ed07 100644
--- a/src/box/index_def.c
+++ b/src/box/index_def.c
@@ -31,6 +31,8 @@
 #include "index_def.h"
 #include "schema_def.h"
 #include "identifier.h"
+#include "tuple_format.h"
+#include "json/json.h"
 
 const char *index_type_strs[] = { "HASH", "TREE", "BITSET", "RTREE" };
 
@@ -278,8 +280,12 @@ index_def_is_valid(struct index_def *index_def, const char *space_name)
 			 * Courtesy to a user who could have made
 			 * a typo.
 			 */
-			if (index_def->key_def->parts[i].fieldno ==
-			    index_def->key_def->parts[j].fieldno) {
+			struct key_part *part_a = &index_def->key_def->parts[i];
+			struct key_part *part_b = &index_def->key_def->parts[j];
+			if (part_a->fieldno == part_b->fieldno &&
+			    json_path_cmp(part_a->path, part_a->path_len,
+					  part_b->path, part_b->path_len,
+					  TUPLE_INDEX_BASE) == 0){
 				diag_set(ClientError, ER_MODIFY_INDEX,
 					 index_def->name, space_name,
 					 "same key part is indexed twice");
diff --git a/src/box/key_def.c b/src/box/key_def.c
index dae3580e2..0ed497dfc 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -28,6 +28,7 @@
  * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
+#include "json/json.h"
 #include "key_def.h"
 #include "tuple_compare.h"
 #include "tuple_extract_key.h"
@@ -35,6 +36,7 @@
 #include "column_mask.h"
 #include "schema_def.h"
 #include "coll_id_cache.h"
+#include "small/region.h"
 
 const char *sort_order_strs[] = { "asc", "desc", "undef" };
 
@@ -44,7 +46,8 @@ const struct key_part_def key_part_def_default = {
 	COLL_NONE,
 	false,
 	ON_CONFLICT_ACTION_DEFAULT,
-	SORT_ORDER_ASC
+	SORT_ORDER_ASC,
+	NULL
 };
 
 static int64_t
@@ -59,6 +62,7 @@ part_type_by_name_wrapper(const char *str, uint32_t len)
 #define PART_OPT_NULLABILITY	 "is_nullable"
 #define PART_OPT_NULLABLE_ACTION "nullable_action"
 #define PART_OPT_SORT_ORDER	 "sort_order"
+#define PART_OPT_PATH		 "path"
 
 const struct opt_def part_def_reg[] = {
 	OPT_DEF_ENUM(PART_OPT_TYPE, field_type, struct key_part_def, type,
@@ -71,19 +75,30 @@ const struct opt_def part_def_reg[] = {
 		     struct key_part_def, nullable_action, NULL),
 	OPT_DEF_ENUM(PART_OPT_SORT_ORDER, sort_order, struct key_part_def,
 		     sort_order, NULL),
+	OPT_DEF(PART_OPT_PATH, OPT_STRPTR, struct key_part_def, path),
 	OPT_END,
 };
 
 struct key_def *
 key_def_dup(const struct key_def *src)
 {
-	size_t sz = key_def_sizeof(src->part_count);
+	size_t sz = 0;
+	for (uint32_t i = 0; i < src->part_count; i++)
+		sz += src->parts[i].path_len;
+	sz = key_def_sizeof(src->part_count, sz);
 	struct key_def *res = (struct key_def *)malloc(sz);
 	if (res == NULL) {
 		diag_set(OutOfMemory, sz, "malloc", "res");
 		return NULL;
 	}
 	memcpy(res, src, sz);
+	/* Update paths to point to the new memory chunk.*/
+	for (uint32_t i = 0; i < src->part_count; i++) {
+		if (src->parts[i].path == NULL)
+			continue;
+		size_t path_offset = src->parts[i].path - (char *)src;
+		res->parts[i].path = (char *)res + path_offset;
+	}
 	return res;
 }
 
@@ -91,8 +106,16 @@ void
 key_def_swap(struct key_def *old_def, struct key_def *new_def)
 {
 	assert(old_def->part_count == new_def->part_count);
-	for (uint32_t i = 0; i < new_def->part_count; i++)
+	for (uint32_t i = 0; i < new_def->part_count; i++) {
 		SWAP(old_def->parts[i], new_def->parts[i]);
+		/*
+		 * Paths are allocated as a part of key_def so
+		 * we need to swap path pointers back - it's OK
+		 * as paths aren't supposed to change.
+		 */
+		assert(old_def->parts[i].path_len == new_def->parts[i].path_len);
+		SWAP(old_def->parts[i].path, new_def->parts[i].path);
+	}
 	SWAP(*old_def, *new_def);
 }
 
@@ -115,24 +138,39 @@ static void
 key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 		 enum field_type type, enum on_conflict_action nullable_action,
 		 struct coll *coll, uint32_t coll_id,
-		 enum sort_order sort_order)
+		 enum sort_order sort_order, const char *path,
+		 uint32_t path_len, char **path_pool)
 {
 	assert(part_no < def->part_count);
 	assert(type < field_type_MAX);
 	def->is_nullable |= (nullable_action == ON_CONFLICT_ACTION_NONE);
+	def->has_json_paths |= path != NULL;
 	def->parts[part_no].nullable_action = nullable_action;
 	def->parts[part_no].fieldno = fieldno;
 	def->parts[part_no].type = type;
 	def->parts[part_no].coll = coll;
 	def->parts[part_no].coll_id = coll_id;
 	def->parts[part_no].sort_order = sort_order;
+	if (path != NULL) {
+		assert(path_pool != NULL);
+		def->parts[part_no].path = *path_pool;
+		*path_pool += path_len;
+		memcpy(def->parts[part_no].path, path, path_len);
+		def->parts[part_no].path_len = path_len;
+	} else {
+		def->parts[part_no].path = NULL;
+		def->parts[part_no].path_len = 0;
+	}
 	column_mask_set_fieldno(&def->column_mask, fieldno);
 }
 
 struct key_def *
 key_def_new(const struct key_part_def *parts, uint32_t part_count)
 {
-	size_t sz = key_def_sizeof(part_count);
+	size_t sz = 0;
+	for (uint32_t i = 0; i < part_count; i++)
+		sz += parts[i].path != NULL ? strlen(parts[i].path) : 0;
+	sz = key_def_sizeof(part_count, sz);
 	struct key_def *def = calloc(1, sz);
 	if (def == NULL) {
 		diag_set(OutOfMemory, sz, "malloc", "struct key_def");
@@ -142,6 +180,8 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count)
 	def->part_count = part_count;
 	def->unique_part_count = part_count;
 
+	/* Paths data in key_def chunk. */
+	char *path_pool = (char *)def + key_def_sizeof(part_count, 0);
 	for (uint32_t i = 0; i < part_count; i++) {
 		const struct key_part_def *part = &parts[i];
 		struct coll *coll = NULL;
@@ -155,16 +195,19 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count)
 			}
 			coll = coll_id->coll;
 		}
+		uint32_t path_len = part->path != NULL ? strlen(part->path) : 0;
 		key_def_set_part(def, i, part->fieldno, part->type,
 				 part->nullable_action, coll, part->coll_id,
-				 part->sort_order);
+				 part->sort_order, part->path, path_len,
+				 &path_pool);
 	}
 	key_def_set_cmp(def);
 	return def;
 }
 
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
+int
+key_def_dump_parts(const struct key_def *def, struct key_part_def *parts,
+		   struct region *region)
 {
 	for (uint32_t i = 0; i < def->part_count; i++) {
 		const struct key_part *part = &def->parts[i];
@@ -174,13 +217,27 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
 		part_def->is_nullable = key_part_is_nullable(part);
 		part_def->nullable_action = part->nullable_action;
 		part_def->coll_id = part->coll_id;
+		if (part->path != NULL) {
+			char *path = region_alloc(region, part->path_len + 1);
+			if (path == NULL) {
+				diag_set(OutOfMemory, part->path_len + 1,
+					 "region", "part_def->path");
+				return -1;
+			}
+			memcpy(path, part->path, part->path_len);
+			path[part->path_len] = '\0';
+			part_def->path = path;
+		} else {
+			part_def->path = NULL;
+		}
 	}
+	return 0;
 }
 
 box_key_def_t *
 box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
 {
-	size_t sz = key_def_sizeof(part_count);
+	size_t sz = key_def_sizeof(part_count, 0);
 	struct key_def *key_def = calloc(1, sz);
 	if (key_def == NULL) {
 		diag_set(OutOfMemory, sz, "malloc", "struct key_def");
@@ -194,7 +251,8 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
 		key_def_set_part(key_def, item, fields[item],
 				 (enum field_type)types[item],
 				 ON_CONFLICT_ACTION_DEFAULT,
-				 NULL, COLL_NONE, SORT_ORDER_ASC);
+				 NULL, COLL_NONE, SORT_ORDER_ASC, NULL, 0,
+				 NULL);
 	}
 	key_def_set_cmp(key_def);
 	return key_def;
@@ -243,6 +301,11 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
 		if (key_part_is_nullable(part1) != key_part_is_nullable(part2))
 			return key_part_is_nullable(part1) <
 			       key_part_is_nullable(part2) ? -1 : 1;
+		int rc = json_path_cmp(part1->path, part1->path_len,
+				       part2->path, part2->path_len,
+				       TUPLE_INDEX_BASE);
+		if (rc != 0)
+			return rc;
 	}
 	return part_count1 < part_count2 ? -1 : part_count1 > part_count2;
 }
@@ -274,8 +337,13 @@ key_def_snprint_parts(char *buf, int size, const struct key_part_def *parts,
 	for (uint32_t i = 0; i < part_count; i++) {
 		const struct key_part_def *part = &parts[i];
 		assert(part->type < field_type_MAX);
-		SNPRINT(total, snprintf, buf, size, "%d, '%s'",
+		SNPRINT(total, snprintf, buf, size, "[%d, '%s'",
 			(int)part->fieldno, field_type_strs[part->type]);
+		if (part->path != NULL) {
+			SNPRINT(total, snprintf, buf, size, ", path='%s'",
+				part->path);
+		}
+		SNPRINT(total, snprintf, buf, size, "]");
 		if (i < part_count - 1)
 			SNPRINT(total, snprintf, buf, size, ", ");
 	}
@@ -294,6 +362,8 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
 			count++;
 		if (part->is_nullable)
 			count++;
+		if (part->path != NULL)
+			count++;
 		size += mp_sizeof_map(count);
 		size += mp_sizeof_str(strlen(PART_OPT_FIELD));
 		size += mp_sizeof_uint(part->fieldno);
@@ -308,6 +378,10 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
 			size += mp_sizeof_str(strlen(PART_OPT_NULLABILITY));
 			size += mp_sizeof_bool(part->is_nullable);
 		}
+		if (part->path != NULL) {
+			size += mp_sizeof_str(strlen(PART_OPT_PATH));
+			size += mp_sizeof_str(strlen(part->path));
+		}
 	}
 	return size;
 }
@@ -323,6 +397,8 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
 			count++;
 		if (part->is_nullable)
 			count++;
+		if (part->path != NULL)
+			count++;
 		data = mp_encode_map(data, count);
 		data = mp_encode_str(data, PART_OPT_FIELD,
 				     strlen(PART_OPT_FIELD));
@@ -342,6 +418,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
 					     strlen(PART_OPT_NULLABILITY));
 			data = mp_encode_bool(data, part->is_nullable);
 		}
+		if (part->path != NULL) {
+			data = mp_encode_str(data, PART_OPT_PATH,
+					     strlen(PART_OPT_PATH));
+			data = mp_encode_str(data, part->path,
+					     strlen(part->path));
+		}
 	}
 	return data;
 }
@@ -403,6 +485,7 @@ key_def_decode_parts_166(struct key_part_def *parts, uint32_t part_count,
 				     fields[part->fieldno].is_nullable :
 				     key_part_def_default.is_nullable);
 		part->coll_id = COLL_NONE;
+		part->path = NULL;
 	}
 	return 0;
 }
@@ -410,7 +493,7 @@ key_def_decode_parts_166(struct key_part_def *parts, uint32_t part_count,
 int
 key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 		     const char **data, const struct field_def *fields,
-		     uint32_t field_count)
+		     uint32_t field_count, struct region *region)
 {
 	if (mp_typeof(**data) == MP_ARRAY) {
 		return key_def_decode_parts_166(parts, part_count, data,
@@ -439,7 +522,7 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 			const char *key = mp_decode_str(data, &key_len);
 			if (opts_parse_key(part, part_def_reg, key, key_len, data,
 					   ER_WRONG_INDEX_OPTIONS,
-					   i + TUPLE_INDEX_BASE, NULL,
+					   i + TUPLE_INDEX_BASE, region,
 					   false) != 0)
 				return -1;
 			if (is_action_missing &&
@@ -485,6 +568,14 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 				 "index part: unknown sort order");
 			return -1;
 		}
+		if (part->path != NULL &&
+		    json_path_validate(part->path, strlen(part->path),
+				       TUPLE_INDEX_BASE) != 0) {
+			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+				 part->fieldno + TUPLE_INDEX_BASE,
+				 "invalid path");
+			return -1;
+		}
 	}
 	return 0;
 }
@@ -504,7 +595,10 @@ key_def_find(const struct key_def *key_def, const struct key_part *to_find)
 	const struct key_part *part = key_def->parts;
 	const struct key_part *end = part + key_def->part_count;
 	for (; part != end; part++) {
-		if (part->fieldno == to_find->fieldno)
+		if (part->fieldno == to_find->fieldno &&
+		    json_path_cmp(part->path, part->path_len,
+				  to_find->path, to_find->path_len,
+				  TUPLE_INDEX_BASE) == 0)
 			return part;
 	}
 	return NULL;
@@ -530,18 +624,25 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 	 * Find and remove part duplicates, i.e. parts counted
 	 * twice since they are present in both key defs.
 	 */
-	const struct key_part *part = second->parts;
-	const struct key_part *end = part + second->part_count;
+	size_t sz = 0;
+	const struct key_part *part = first->parts;
+	const struct key_part *end = part + first->part_count;
+	for (; part != end; part++)
+		sz += part->path_len;
+	part = second->parts;
+	end = part + second->part_count;
 	for (; part != end; part++) {
 		if (key_def_find(first, part) != NULL)
 			--new_part_count;
+		else
+			sz += part->path_len;
 	}
 
+	sz = key_def_sizeof(new_part_count, sz);
 	struct key_def *new_def;
-	new_def =  (struct key_def *)calloc(1, key_def_sizeof(new_part_count));
+	new_def = (struct key_def *)calloc(1, sz);
 	if (new_def == NULL) {
-		diag_set(OutOfMemory, key_def_sizeof(new_part_count), "malloc",
-			 "new_def");
+		diag_set(OutOfMemory, sz, "malloc", "new_def");
 		return NULL;
 	}
 	new_def->part_count = new_part_count;
@@ -549,6 +650,9 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 	new_def->is_nullable = first->is_nullable || second->is_nullable;
 	new_def->has_optional_parts = first->has_optional_parts ||
 				      second->has_optional_parts;
+
+	/* Paths data in the new key_def chunk. */
+	char *path_pool = (char *)new_def + key_def_sizeof(new_part_count, 0);
 	/* Write position in the new key def. */
 	uint32_t pos = 0;
 	/* Append first key def's parts to the new index_def. */
@@ -557,7 +661,8 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 	for (; part != end; part++) {
 		key_def_set_part(new_def, pos++, part->fieldno, part->type,
 				 part->nullable_action, part->coll,
-				 part->coll_id, part->sort_order);
+				 part->coll_id, part->sort_order, part->path,
+				 part->path_len, &path_pool);
 	}
 
 	/* Set-append second key def's part to the new key def. */
@@ -568,7 +673,8 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 			continue;
 		key_def_set_part(new_def, pos++, part->fieldno, part->type,
 				 part->nullable_action, part->coll,
-				 part->coll_id, part->sort_order);
+				 part->coll_id, part->sort_order, part->path,
+				 part->path_len, &path_pool);
 	}
 	key_def_set_cmp(new_def);
 	return new_def;
diff --git a/src/box/key_def.h b/src/box/key_def.h
index d1866303b..fe4acffb5 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -64,6 +64,12 @@ struct key_part_def {
 	enum on_conflict_action nullable_action;
 	/** Part sort order. */
 	enum sort_order sort_order;
+	/**
+	 * JSON path to indexed data, relative to the field number,
+	 * or NULL if this key part indexes a top-level field.
+	 * This sting is 0-terminated.
+	 */
+	const char *path;
 };
 
 extern const struct key_part_def key_part_def_default;
@@ -82,6 +88,15 @@ struct key_part {
 	enum on_conflict_action nullable_action;
 	/** Part sort order. */
 	enum sort_order sort_order;
+	/**
+	 * JSON path to indexed data, relative to the field number,
+	 * or NULL if this key part index a top-level field.
+	 * This string is not 0-terminated. String memory is
+	 * allocated at the end of key_def chunk.
+	 */
+	char *path;
+	/** The length of JSON path. */
+	uint32_t path_len;
 };
 
 struct key_def;
@@ -148,6 +163,8 @@ struct key_def {
 	uint32_t unique_part_count;
 	/** True, if at least one part can store NULL. */
 	bool is_nullable;
+	/** True if some key part has JSON path. */
+	bool has_json_paths;
 	/**
 	 * True, if some key parts can be absent in a tuple. These
 	 * fields assumed to be MP_NIL.
@@ -241,9 +258,10 @@ box_tuple_compare_with_key(const box_tuple_t *tuple_a, const char *key_b,
 /** \endcond public */
 
 static inline size_t
-key_def_sizeof(uint32_t part_count)
+key_def_sizeof(uint32_t part_count, uint32_t path_pool_size)
 {
-	return sizeof(struct key_def) + sizeof(struct key_part) * part_count;
+	return sizeof(struct key_def) + sizeof(struct key_part) * part_count +
+	       path_pool_size;
 }
 
 /**
@@ -255,9 +273,12 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count);
 
 /**
  * Dump part definitions of the given key def.
+ * The region is used for allocating JSON paths, if any.
+ * Return -1 on memory allocation error, 0 on success.
  */
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
+int
+key_def_dump_parts(const struct key_def *def, struct key_part_def *parts,
+		   struct region *region);
 
 /**
  * Update 'has_optional_parts' of @a key_def with correspondence
@@ -299,11 +320,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
  *  [NUM, STR, ..][NUM, STR, ..]..,
  *  OR
  *  {field=NUM, type=STR, ..}{field=NUM, type=STR, ..}..,
+ *  The region is used for allocating JSON paths, if any.
  */
 int
 key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
 		     const char **data, const struct field_def *fields,
-		     uint32_t field_count);
+		     uint32_t field_count, struct region *region);
 
 /**
  * Returns the part in index_def->parts for the specified fieldno.
@@ -364,6 +386,8 @@ key_validate_parts(const struct key_def *key_def, const char *key,
 static inline bool
 key_def_is_sequential(const struct key_def *key_def)
 {
+	if (key_def->has_json_paths)
+		return false;
 	for (uint32_t part_id = 0; part_id < key_def->part_count; part_id++) {
 		if (key_def->parts[part_id].fieldno != part_id)
 			return false;
diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 7cae436f1..1f152917e 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -296,6 +296,11 @@ lbox_fillspace(struct lua_State *L, struct space *space, int i)
 			lua_pushnumber(L, part->fieldno + TUPLE_INDEX_BASE);
 			lua_setfield(L, -2, "fieldno");
 
+			if (part->path != NULL) {
+				lua_pushlstring(L, part->path, part->path_len);
+				lua_setfield(L, -2, "path");
+			}
+
 			lua_pushboolean(L, key_part_is_nullable(part));
 			lua_setfield(L, -2, "is_nullable");
 
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 5cf70ab94..2cae791e1 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1317,6 +1317,10 @@ memtx_index_def_change_requires_rebuild(struct index *index,
 			return true;
 		if (old_part->coll != new_part->coll)
 			return true;
+		if (json_path_cmp(old_part->path, old_part->path_len,
+				  new_part->path, new_part->path_len,
+				  TUPLE_INDEX_BASE) != 0)
+			return true;
 	}
 	return false;
 }
diff --git a/src/box/sql.c b/src/box/sql.c
index 081a038f1..c955429dd 100644
--- a/src/box/sql.c
+++ b/src/box/sql.c
@@ -386,6 +386,7 @@ sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
 		part->nullable_action = ON_CONFLICT_ACTION_NONE;
 		part->is_nullable = true;
 		part->sort_order = SORT_ORDER_ASC;
+		part->path = NULL;
 		if (def != NULL && i < def->part_count)
 			part->coll_id = def->parts[i].coll_id;
 		else
diff --git a/src/box/sql/build.c b/src/box/sql/build.c
index 49b90b5d0..947daf8f6 100644
--- a/src/box/sql/build.c
+++ b/src/box/sql/build.c
@@ -2185,6 +2185,7 @@ index_fill_def(struct Parse *parse, struct index *index,
 		part->is_nullable = part->nullable_action == ON_CONFLICT_ACTION_NONE;
 		part->sort_order = SORT_ORDER_ASC;
 		part->coll_id = coll_id;
+		part->path = NULL;
 	}
 	key_def = key_def_new(key_parts, expr_list->nExpr);
 	if (key_def == NULL)
diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index 02ee225f1..3f136a342 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -1360,6 +1360,7 @@ sql_key_info_new(sqlite3 *db, uint32_t part_count)
 		part->is_nullable = false;
 		part->nullable_action = ON_CONFLICT_ACTION_ABORT;
 		part->sort_order = SORT_ORDER_ASC;
+		part->path = NULL;
 	}
 	return key_info;
 }
@@ -1377,7 +1378,7 @@ sql_key_info_new_from_key_def(sqlite3 *db, const struct key_def *key_def)
 	key_info->key_def = NULL;
 	key_info->refs = 1;
 	key_info->part_count = key_def->part_count;
-	key_def_dump_parts(key_def, key_info->parts);
+	key_def_dump_parts(key_def, key_info->parts, NULL);
 	return key_info;
 }
 
diff --git a/src/box/sql/where.c b/src/box/sql/where.c
index 571b5af78..814bd3926 100644
--- a/src/box/sql/where.c
+++ b/src/box/sql/where.c
@@ -2807,6 +2807,7 @@ whereLoopAddBtree(WhereLoopBuilder * pBuilder,	/* WHERE clause information */
 		part.is_nullable = false;
 		part.sort_order = SORT_ORDER_ASC;
 		part.coll_id = COLL_NONE;
+		part.path = NULL;
 
 		struct key_def *key_def = key_def_new(&part, 1);
 		if (key_def == NULL) {
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 3fe4cae32..7ab6e3bf6 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -469,7 +469,8 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	struct key_part *part = key_def->parts;
 	const char *tuple_a_raw = tuple_data(tuple_a);
 	const char *tuple_b_raw = tuple_data(tuple_b);
-	if (key_def->part_count == 1 && part->fieldno == 0) {
+	if (key_def->part_count == 1 && part->fieldno == 0 &&
+	    part->path == NULL) {
 		/*
 		 * First field can not be optional - empty tuples
 		 * can not exist.
@@ -1027,7 +1028,7 @@ tuple_compare_create(const struct key_def *def)
 		}
 	}
 	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def)) {
+	if (!key_def_has_collation(def) && !def->has_json_paths) {
 		/* Precalculated comparators don't use collation */
 		for (uint32_t k = 0;
 		     k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
@@ -1247,7 +1248,7 @@ tuple_compare_with_key_create(const struct key_def *def)
 		}
 	}
 	assert(! def->has_optional_parts);
-	if (!key_def_has_collation(def)) {
+	if (!key_def_has_collation(def) && !def->has_json_paths) {
 		/* Precalculated comparators don't use collation */
 		for (uint32_t k = 0;
 		     k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index ac8b5a44e..3efea9e7f 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -8,9 +8,11 @@ enum { MSGPACK_NULL = 0xc0 };
 static inline bool
 key_def_parts_are_sequential(const struct key_def *def, int i)
 {
-	uint32_t fieldno1 = def->parts[i].fieldno + 1;
-	uint32_t fieldno2 = def->parts[i + 1].fieldno;
-	return fieldno1 == fieldno2;
+	const struct key_part *part1 = &def->parts[i];
+	const struct key_part *part2 = &def->parts[i + 1];
+	return part1->fieldno + 1 == part2->fieldno &&
+	       part1->path == NULL && part2->path == NULL;
+
 }
 
 /** True, if a key con contain two or more parts in sequence. */
@@ -241,7 +243,8 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 			if (!key_def_parts_are_sequential(key_def, i))
 				break;
 		}
-		uint32_t end_fieldno = key_def->parts[i].fieldno;
+		const struct key_part *part = &key_def->parts[i];
+		uint32_t end_fieldno = part->fieldno;
 
 		if (fieldno < current_fieldno) {
 			/* Rewind. */
@@ -283,8 +286,23 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 				current_fieldno++;
 			}
 		}
-		memcpy(key_buf, field, field_end - field);
-		key_buf += field_end - field;
+		const char *src = field;
+		const char *src_end = field_end;
+		if (part->path != NULL) {
+			if (tuple_field_go_to_path(&src, part->path,
+						   part->path_len) != 0) {
+				/*
+				 * All tuples must be valid as all
+				 * integrity checks have already
+				 * passed.
+				 */
+				unreachable();
+			}
+			src_end = src;
+			mp_next(&src_end);
+		}
+		memcpy(key_buf, src, src_end - src);
+		key_buf += src_end - src;
 		if (has_optional_parts && null_count != 0) {
 			memset(key_buf, MSGPACK_NULL, null_count);
 			key_buf += null_count * mp_sizeof_nil();
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 61b4c9734..9af9c307a 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -69,9 +69,86 @@ static const char *
 tuple_field_path(const struct tuple_field *field)
 {
 	assert(field->token.parent != NULL);
-	assert(field->token.parent->parent == NULL);
-	assert(field->token.type == JSON_TOKEN_NUM);
-	return int2str(field->token.num + TUPLE_INDEX_BASE);
+	if (field->token.parent->parent == NULL) {
+		/* Top-level field, no need to format the path. */
+		return int2str(field->token.num + TUPLE_INDEX_BASE);
+	}
+	char *path = tt_static_buf();
+	json_tree_snprint_path(path, TT_STATIC_BUF_LEN, &field->token,
+			       TUPLE_INDEX_BASE);
+	return path;
+}
+
+/**
+ * Given a field number and a path, add the corresponding tuple
+ * field to the tuple format, allocating intermediate fields if
+ * necessary.
+ * Return a pointer to the leaf field on success, NULL on memory
+ * allocation error or type/nullability mistmatch error, diag
+ * message is set.
+ */
+static struct tuple_field *
+tuple_format_add_field(struct tuple_format *format, uint32_t fieldno,
+		       const char *path, uint32_t path_len)
+{
+	struct tuple_field *field = NULL;
+	struct tuple_field *parent = tuple_format_field(format, fieldno);
+	if (path_len == 0)
+		goto end;
+	field = tuple_field_new();
+	if (field == NULL)
+		goto fail;
+
+	int rc = 0;
+	uint32_t token_count = 0;
+	struct json_tree *tree = &format->fields;
+	struct json_lexer lexer;
+	json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
+	while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 &&
+	       field->token.type != JSON_TOKEN_END) {
+		enum field_type expected_type =
+			field->token.type == JSON_TOKEN_STR ?
+			FIELD_TYPE_MAP : FIELD_TYPE_ARRAY;
+		if (field_type1_contains_type2(parent->type, expected_type)) {
+			parent->type = expected_type;
+		} else {
+			diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
+				 tuple_field_path(parent),
+				 field_type_strs[parent->type],
+				 field_type_strs[expected_type]);
+			goto fail;
+		}
+		struct tuple_field *next =
+			json_tree_lookup_entry(tree, &parent->token,
+					       &field->token,
+					       struct tuple_field, token);
+		if (next == NULL) {
+			field->id = format->total_field_count++;
+			rc = json_tree_add(tree, &parent->token, &field->token);
+			if (rc != 0) {
+				diag_set(OutOfMemory, sizeof(struct json_token),
+					 "json_tree_add", "tree");
+				goto fail;
+			}
+			next = field;
+			field = tuple_field_new();
+			if (field == NULL)
+				goto fail;
+		}
+		parent = next;
+		token_count++;
+	}
+	/* Path has been verified by key_def_decode_parts. */
+	assert(rc == 0 && field->token.type == JSON_TOKEN_END);
+	assert(parent != NULL);
+	/* Update tree depth information. */
+	format->fields_depth = MAX(format->fields_depth, token_count + 1);
+end:
+	tuple_field_delete(field);
+	return parent;
+fail:
+	parent = NULL;
+	goto end;
 }
 
 /**
@@ -95,15 +172,26 @@ tuple_format_field_by_id(struct tuple_format *format, uint32_t id)
 static int
 tuple_format_use_key_part(struct tuple_format *format, uint32_t field_count,
 			  const struct key_part *part, bool is_sequential,
-			  int *current_slot)
+			  int *current_slot, char **path_pool)
 {
 	assert(part->fieldno < tuple_format_field_count(format));
-	struct tuple_field *field = tuple_format_field(format, part->fieldno);
+	const char *path = *path_pool;
+	if (part->path != NULL) {
+		/**
+		 * Copy JSON path data to reserved area at the
+		 * end of format allocation.
+		 */
+		memcpy(*path_pool, part->path, part->path_len);
+		*path_pool += part->path_len;
+	}
+	struct tuple_field *field = tuple_format_add_field(format, part->fieldno,
+							   path, part->path_len);
+	if (field == NULL)
+		return -1;
 	/*
-		* If a field is not present in the space format,
-		* inherit nullable action of the first key part
-		* referencing it.
-		*/
+	 * If a field is not present in the space format, inherit
+	 * nullable action of the first key part referencing it.
+	 */
 	if (part->fieldno >= field_count && !field->is_key_part)
 		field->nullable_action = part->nullable_action;
 	/*
@@ -158,7 +246,8 @@ tuple_format_use_key_part(struct tuple_format *format, uint32_t field_count,
 	 * simply accessible, so we don't store an offset for it.
 	 */
 	if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL &&
-	    is_sequential == false && part->fieldno > 0) {
+	    is_sequential == false &&
+	    (part->fieldno > 0 || part->path != NULL)) {
 		*current_slot = *current_slot - 1;
 		field->offset_slot = *current_slot;
 	}
@@ -203,6 +292,11 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 
 	int current_slot = 0;
 
+	/*
+	 * Set pointer to reserved area in the format chunk
+	 * allocated with tuple_format_alloc call.
+	 */
+	char *path_pool = (char *)format + sizeof(struct tuple_format);
 	/* extract field type info */
 	for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
 		const struct key_def *key_def = keys[key_no];
@@ -213,7 +307,8 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 		for (; part < parts_end; part++) {
 			if (tuple_format_use_key_part(format, field_count, part,
 						      is_sequential,
-						      &current_slot) != 0)
+						      &current_slot,
+						      &path_pool) != 0)
 				return -1;
 		}
 	}
@@ -236,6 +331,8 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 			 "malloc", "required field bitmap");
 		return -1;
 	}
+	format->min_tuple_size =
+		mp_sizeof_array(tuple_format_field_count(format));
 	struct tuple_field *field;
 	json_tree_foreach_entry_preorder(field, &format->fields.root,
 					 struct tuple_field, token) {
@@ -247,6 +344,34 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
 		if (json_token_is_leaf(&field->token) &&
 		    !tuple_field_is_nullable(field))
 			bit_set(format->required_fields, field->id);
+
+		/* Update format::min_tuple_size by field. */
+		if (field->token.type == JSON_TOKEN_NUM) {
+			/*
+			 * Account a gap between omitted array
+			 * items.
+			 */
+			struct json_token **neighbors =
+				field->token.parent->children;
+			for (int i = field->token.num - 1;
+			     i > 0 && neighbors[i] == NULL; i--)
+				format->min_tuple_size += mp_sizeof_nil();
+		} else {
+			/* Account a key string for map member. */
+			assert(field->token.type == JSON_TOKEN_STR);
+			format->min_tuple_size +=
+				mp_sizeof_str(field->token.len);
+		}
+		int max_child_idx = field->token.max_child_idx;
+		if (json_token_is_leaf(&field->token)) {
+			format->min_tuple_size += mp_sizeof_nil();
+		} else if (field->type == FIELD_TYPE_ARRAY) {
+			format->min_tuple_size +=
+				mp_sizeof_array(max_child_idx + 1);
+		} else if (field->type == FIELD_TYPE_MAP) {
+			format->min_tuple_size +=
+				mp_sizeof_map(max_child_idx + 1);
+		}
 	}
 	return 0;
 }
@@ -317,6 +442,8 @@ static struct tuple_format *
 tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
 		   uint32_t space_field_count, struct tuple_dictionary *dict)
 {
+	/* Size of area to store JSON paths data. */
+	uint32_t path_pool_size = 0;
 	uint32_t index_field_count = 0;
 	/* find max max field no */
 	for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
@@ -326,13 +453,15 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
 		for (; part < pend; part++) {
 			index_field_count = MAX(index_field_count,
 						part->fieldno + 1);
+			path_pool_size += part->path_len;
 		}
 	}
 	uint32_t field_count = MAX(space_field_count, index_field_count);
 
-	struct tuple_format *format = malloc(sizeof(struct tuple_format));
+	uint32_t allocation_size = sizeof(struct tuple_format) + path_pool_size;
+	struct tuple_format *format = malloc(allocation_size);
 	if (format == NULL) {
-		diag_set(OutOfMemory, sizeof(struct tuple_format), "malloc",
+		diag_set(OutOfMemory, allocation_size, "malloc",
 			 "tuple format");
 		return NULL;
 	}
@@ -368,6 +497,8 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
 	}
 	format->total_field_count = field_count;
 	format->required_fields = NULL;
+	format->fields_depth = 1;
+	format->min_tuple_size = 0;
 	format->refs = 0;
 	format->id = FORMAT_ID_NIL;
 	format->index_field_count = index_field_count;
@@ -428,15 +559,29 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
 {
 	if (format1->exact_field_count != format2->exact_field_count)
 		return false;
-	uint32_t format1_field_count = tuple_format_field_count(format1);
-	uint32_t format2_field_count = tuple_format_field_count(format2);
-	for (uint32_t i = 0; i < format1_field_count; ++i) {
-		struct tuple_field *field1 = tuple_format_field(format1, i);
+	struct region *region = &fiber()->gc;
+	size_t region_svp = region_used(region);
+	bool ret = false;
+	struct tuple_field *field1;
+	json_tree_foreach_entry_preorder(field1, &format1->fields.root,
+					 struct tuple_field, token) {
+		region_truncate(region, region_svp);
+		uint32_t path_len = json_tree_snprint_path(NULL, 0,
+					&field1->token, TUPLE_INDEX_BASE);
+		char *path = region_alloc(region, path_len + 1);
+		if (path == NULL)
+			panic("Can not allocate memory for path");
+		json_tree_snprint_path(path, path_len + 1, &field1->token,
+				       TUPLE_INDEX_BASE);
+		struct tuple_field *field2 =
+			json_tree_lookup_path_entry(&format2->fields,
+				&format2->fields.root, path, path_len,
+				TUPLE_INDEX_BASE, struct tuple_field, token);
 		/*
 		 * The field has a data type in format1, but has
 		 * no data type in format2.
 		 */
-		if (i >= format2_field_count) {
+		if (field2 == NULL) {
 			/*
 			 * The field can get a name added
 			 * for it, and this doesn't require a data
@@ -450,20 +595,22 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
 			    tuple_field_is_nullable(field1))
 				continue;
 			else
-				return false;
+				goto out;
 		}
-		struct tuple_field *field2 = tuple_format_field(format2, i);
 		if (! field_type1_contains_type2(field1->type, field2->type))
-			return false;
+			goto out;
 		/*
 		 * Do not allow transition from nullable to non-nullable:
 		 * it would require a check of all data in the space.
 		 */
 		if (tuple_field_is_nullable(field2) &&
 		    !tuple_field_is_nullable(field1))
-			return false;
+			goto out;
 	}
-	return true;
+	ret = true;
+out:
+	region_truncate(region, region_svp);
+	return ret;
 }
 
 /** @sa declaration for details. */
@@ -512,49 +659,79 @@ tuple_init_field_map(struct tuple_format *format, uint32_t *field_map,
 		/* Empty tuple, nothing to do. */
 		goto skip;
 	}
-	/* first field is simply accessible, so we do not store offset to it */
-	struct tuple_field *field = tuple_format_field(format, 0);
-	if (validate &&
-	    !field_mp_type_is_compatible(field->type, mp_typeof(*pos),
-					 tuple_field_is_nullable(field))) {
-		diag_set(ClientError, ER_FIELD_TYPE, tuple_field_path(field),
-			 field_type_strs[field->type]);
-		goto error;
-	}
-	if (required_fields != NULL)
-		bit_clear(required_fields, field->id);
-	mp_next(&pos);
-	/* other fields...*/
-	uint32_t i = 1;
 	uint32_t defined_field_count = MIN(field_count, validate ?
 					   tuple_format_field_count(format) :
 					   format->index_field_count);
-	if (field_count < format->index_field_count) {
-		/*
-		 * Nullify field map to be able to detect by 0,
-		 * which key fields are absent in tuple_field().
-		 */
-		memset((char *)field_map - format->field_map_size, 0,
-		       format->field_map_size);
-	}
-	for (; i < defined_field_count; ++i) {
-		field = tuple_format_field(format, i);
-		if (validate &&
-		    !field_mp_type_is_compatible(field->type, mp_typeof(*pos),
-						 tuple_field_is_nullable(field))) {
-			diag_set(ClientError, ER_FIELD_TYPE,
-				 tuple_field_path(field),
-				 field_type_strs[field->type]);
-			goto error;
+	/*
+	 * Nullify field map to be able to detect by 0,
+	 * which key fields are absent in tuple_field().
+	 */
+	memset((char *)field_map - format->field_map_size, 0,
+		format->field_map_size);
+	uint32_t mp_frames_sz = format->fields_depth * sizeof(struct mp_frame);
+	struct mp_frame *mp_frames = region_alloc(region, mp_frames_sz);
+	if (mp_frames == NULL) {
+		diag_set(OutOfMemory, mp_frames_sz, "region", "frames");
+		goto error;
+	}
+	struct mp_stack mp_stack;
+	mp_stack_init(&mp_stack, format->fields_depth, mp_frames);
+	mp_stack_push(&mp_stack, MP_ARRAY, defined_field_count);
+	struct tuple_field *field;
+	struct json_token *parent = &format->fields.root;
+	while (1) {
+		while (mp_stack_advance(&mp_stack)) {
+			mp_stack_pop(&mp_stack);
+			if (mp_stack_is_empty(&mp_stack))
+				goto skip;
+			parent = parent->parent;
+		}
+		struct json_token token;
+		token.type = mp_stack_top(&mp_stack)->type == MP_ARRAY ?
+			     JSON_TOKEN_NUM : JSON_TOKEN_STR;
+		if (token.type == JSON_TOKEN_NUM) {
+			token.num = mp_stack_top(&mp_stack)->curr - 1;
+		} else if (token.type == JSON_TOKEN_STR) {
+			if (mp_typeof(*pos) != MP_STR) {
+				/* Skip key. */
+				mp_next(&pos);
+				mp_next(&pos);
+				continue;
+			}
+			token.str = mp_decode_str(&pos, (uint32_t *)&token.len);
+		} else {
+			unreachable();
+		}
+		enum mp_type type = mp_typeof(*pos);
+		assert(parent != NULL);
+		field = json_tree_lookup_entry(&format->fields, parent, &token,
+					       struct tuple_field, token);
+		if (field != NULL) {
+			bool is_nullable = tuple_field_is_nullable(field);
+			if (validate &&
+			    !field_mp_type_is_compatible(field->type, type,
+							 is_nullable) != 0) {
+				diag_set(ClientError, ER_FIELD_TYPE,
+					 tuple_field_path(field),
+					 field_type_strs[field->type]);
+				goto error;
+			}
+			if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+				field_map[field->offset_slot] = pos - tuple;
+			if (required_fields != NULL)
+				bit_clear(required_fields, field->id);
 		}
-		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
-			field_map[field->offset_slot] =
-				(uint32_t) (pos - tuple);
+		if ((type == MP_ARRAY || type == MP_MAP) &&
+		    !mp_stack_is_full(&mp_stack) && field != NULL) {
+			uint32_t size = type == MP_ARRAY ?
+					mp_decode_array(&pos) :
+					mp_decode_map(&pos);
+			mp_stack_push(&mp_stack, type, size);
+			parent = &field->token;
+		} else {
+			mp_next(&pos);
 		}
-		if (required_fields != NULL)
-			bit_clear(required_fields, field->id);
-		mp_next(&pos);
-	}
+	};
 skip:
 	/*
 	 * Check the required field bitmap for missing fields.
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index d32bb91f2..311bacbe8 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -185,6 +185,15 @@ struct tuple_format {
 	 * Shared names storage used by all formats of a space.
 	 */
 	struct tuple_dictionary *dict;
+	/**
+	 * The size of a minimal tuple conforming to the format
+	 * and filled with nils.
+	 */
+	uint32_t min_tuple_size;
+	/**
+	 * A maximum depth of format::fields subtree.
+	 */
+	uint32_t fields_depth;
 	/**
 	 * Fields comprising the format, organized in a tree.
 	 * First level nodes correspond to tuple fields.
@@ -221,6 +230,23 @@ tuple_format_field(struct tuple_format *format, uint32_t fieldno)
 				      &token, struct tuple_field, token);
 }
 
+/**
+ * Return meta information of a tuple field given a format,
+ * field index and path.
+ */
+static inline struct tuple_field *
+tuple_format_field_by_path(struct tuple_format *format, uint32_t fieldno,
+			   const char *path, uint32_t path_len)
+{
+	struct tuple_field *root = tuple_format_field(format, fieldno);
+	assert(root != NULL);
+	if (path == NULL)
+		return root;
+	return json_tree_lookup_path_entry(&format->fields, &root->token,
+					   path, path_len, TUPLE_INDEX_BASE,
+					   struct tuple_field, token);
+}
+
 extern struct tuple_format **tuple_formats;
 
 static inline uint32_t
@@ -416,6 +442,22 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 			const uint32_t *field_map, uint32_t fieldno,
 			const char *path, uint32_t path_len)
 {
+	if (likely(path != NULL && fieldno < format->index_field_count)) {
+		/* Indexed field */
+		struct tuple_field *field =
+			tuple_format_field_by_path(format, fieldno,
+						   path, path_len);
+		if (field == NULL)
+			goto parse;
+		assert(field != NULL);
+		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
+			assert(-field->offset_slot * sizeof(uint32_t) <=
+			       format->field_map_size);
+			return field_map[field->offset_slot] == 0 ?
+			       NULL : tuple + field_map[field->offset_slot];
+		}
+	}
+parse:
 	tuple = tuple_field_raw(format, tuple, field_map, fieldno);
 	if (path == NULL)
 		return tuple;
@@ -454,7 +496,8 @@ static inline const char *
 tuple_field_by_part_raw(struct tuple_format *format, const char *data,
 			const uint32_t *field_map, struct key_part *part)
 {
-	return tuple_field_raw(format, data, field_map, part->fieldno);
+	return tuple_field_raw_by_path(format, data, field_map, part->fieldno,
+				       part->path, part->path_len);
 }
 
 #if defined(__cplusplus)
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index b394804fe..3486ce11c 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -222,7 +222,7 @@ key_hash_slowpath(const char *key, struct key_def *key_def);
 
 void
 tuple_hash_func_set(struct key_def *key_def) {
-	if (key_def->is_nullable)
+	if (key_def->is_nullable || key_def->has_json_paths)
 		goto slowpath;
 	/*
 	 * Check that key_def defines sequential a key without holes
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index ca987134c..acd2d7fd6 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -982,6 +982,10 @@ vinyl_index_def_change_requires_rebuild(struct index *index,
 			return true;
 		if (!field_type1_contains_type2(new_part->type, old_part->type))
 			return true;
+		if (json_path_cmp(old_part->path, old_part->path_len,
+				  new_part->path, new_part->path_len,
+				  TUPLE_INDEX_BASE) != 0)
+			return true;
 	}
 	return false;
 }
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index c9e0713c8..0b152b203 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -581,8 +581,9 @@ vy_log_record_decode(struct vy_log_record *record,
 			record->group_id = mp_decode_uint(&pos);
 			break;
 		case VY_LOG_KEY_DEF: {
+			struct region *region = &fiber()->gc;
 			uint32_t part_count = mp_decode_array(&pos);
-			struct key_part_def *parts = region_alloc(&fiber()->gc,
+			struct key_part_def *parts = region_alloc(region,
 						sizeof(*parts) * part_count);
 			if (parts == NULL) {
 				diag_set(OutOfMemory,
@@ -591,7 +592,7 @@ vy_log_record_decode(struct vy_log_record *record,
 				return -1;
 			}
 			if (key_def_decode_parts(parts, part_count, &pos,
-						 NULL, 0) != 0) {
+						 NULL, 0, region) != 0) {
 				diag_log();
 				diag_set(ClientError, ER_INVALID_VYLOG_FILE,
 					 "Bad record: failed to decode "
@@ -705,7 +706,8 @@ vy_log_record_dup(struct region *pool, const struct vy_log_record *src)
 				 "struct key_part_def");
 			goto err;
 		}
-		key_def_dump_parts(src->key_def, dst->key_parts);
+		if (key_def_dump_parts(src->key_def, dst->key_parts, pool) != 0)
+			goto err;
 		dst->key_part_count = src->key_def->part_count;
 		dst->key_def = NULL;
 	}
diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
index ddbc2d46f..23b7a5869 100644
--- a/src/box/vy_point_lookup.c
+++ b/src/box/vy_point_lookup.c
@@ -196,7 +196,8 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx,
 		const struct vy_read_view **rv,
 		struct tuple *key, struct tuple **ret)
 {
-	assert(tuple_field_count(key) >= lsm->cmp_def->part_count);
+	assert(tuple_field_count(key) >= lsm->cmp_def->part_count ||
+	       lsm->cmp_def->has_json_paths);
 
 	*ret = NULL;
 	double start_time = ev_monotonic_now(loop());
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 47f135c65..566183f45 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -385,26 +385,33 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
 	struct region *region = &fiber()->gc;
 
 	uint32_t field_count = format->index_field_count;
-	struct iovec *iov = region_alloc(region, sizeof(*iov) * field_count);
+	uint32_t iov_sz = sizeof(struct iovec) * format->total_field_count;
+	struct iovec *iov = region_alloc(region, iov_sz);
 	if (iov == NULL) {
-		diag_set(OutOfMemory, sizeof(*iov) * field_count,
-			 "region", "iov for surrogate key");
+		diag_set(OutOfMemory, iov_sz, "region",
+			 "iov for surrogate key");
 		return NULL;
 	}
-	memset(iov, 0, sizeof(*iov) * field_count);
+	memset(iov, 0, iov_sz);
 	uint32_t part_count = mp_decode_array(&key);
 	assert(part_count == cmp_def->part_count);
-	assert(part_count <= field_count);
-	uint32_t nulls_count = field_count - cmp_def->part_count;
-	uint32_t bsize = mp_sizeof_array(field_count) +
-			 mp_sizeof_nil() * nulls_count;
+	assert(part_count <= format->total_field_count);
+	/**
+	 * Calculate bsize using format::min_tuple_size tuple
+	 * where parts_count nulls replaced with extracted keys.
+	 */
+	uint32_t bsize = format->min_tuple_size - mp_sizeof_nil() * part_count;
 	for (uint32_t i = 0; i < part_count; ++i) {
 		const struct key_part *part = &cmp_def->parts[i];
 		assert(part->fieldno < field_count);
+		struct tuple_field *field =
+			tuple_format_field_by_path(format, part->fieldno,
+						   part->path, part->path_len);
+		assert(field != NULL);
 		const char *svp = key;
-		iov[part->fieldno].iov_base = (char *) key;
+		iov[field->id].iov_base = (char *) key;
 		mp_next(&key);
-		iov[part->fieldno].iov_len = key - svp;
+		iov[field->id].iov_len = key - svp;
 		bsize += key - svp;
 	}
 
@@ -414,16 +421,47 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
 
 	char *raw = (char *) tuple_data(stmt);
 	uint32_t *field_map = (uint32_t *) raw;
+	memset((char *)field_map - format->field_map_size, 0,
+	       format->field_map_size);
 	char *wpos = mp_encode_array(raw, field_count);
-	for (uint32_t i = 0; i < field_count; ++i) {
-		struct tuple_field *field = tuple_format_field(format, i);
-		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
-			field_map[field->offset_slot] = wpos - raw;
-		if (iov[i].iov_base == NULL) {
-			wpos = mp_encode_nil(wpos);
+
+	struct tuple_field *field;
+	json_tree_foreach_entry_preorder(field, &format->fields.root,
+					 struct tuple_field, token) {
+		if (field->token.type == JSON_TOKEN_NUM) {
+			/*
+			 * Write nil istead of omitted array
+			 * members.
+			 */
+			struct json_token **neighbors =
+				field->token.parent->children;
+			for (int i = field->token.num - 1;
+			     i > 0 && neighbors[i] == NULL; i--)
+				wpos = mp_encode_nil(wpos);
 		} else {
-			memcpy(wpos, iov[i].iov_base, iov[i].iov_len);
-			wpos += iov[i].iov_len;
+			/* Write a key string for map member. */
+			assert(field->token.type == JSON_TOKEN_STR);
+			const char *str = field->token.str;
+			uint32_t len = field->token.len;
+			wpos = mp_encode_str(wpos, str, len);
+		}
+		int max_child_idx = field->token.max_child_idx;
+		if (json_token_is_leaf(&field->token)) {
+			if (iov[field->id].iov_len == 0) {
+				wpos = mp_encode_nil(wpos);
+			} else {
+				memcpy(wpos, iov[field->id].iov_base,
+				       iov[field->id].iov_len);
+				uint32_t data_offset = wpos - raw;
+				int32_t slot = field->offset_slot;
+				if (slot != TUPLE_OFFSET_SLOT_NIL)
+					field_map[slot] = data_offset;
+				wpos += iov[field->id].iov_len;
+			}
+		} else if (field->type == FIELD_TYPE_ARRAY) {
+			wpos = mp_encode_array(wpos, max_child_idx + 1);
+		} else if (field->type == FIELD_TYPE_MAP) {
+			wpos = mp_encode_map(wpos, max_child_idx + 1);
 		}
 	}
 	assert(wpos == raw + bsize);
diff --git a/test/engine/json.result b/test/engine/json.result
new file mode 100644
index 000000000..9e5accb0b
--- /dev/null
+++ b/test/engine/json.result
@@ -0,0 +1,452 @@
+test_run = require('test_run').new()
+---
+...
+engine = test_run:get_cfg('engine')
+---
+...
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO["fname"]'}, {3, 'str', path = '["FIO"].fname'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': same key
+    part is indexed twice'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '["FIO"]["fname"]'}}})
+---
+- error: 'Wrong index options (field 2): ''path'' must be string'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = 'FIO'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+    ''map'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[1]'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+    ''array'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO'}, {3, 'str', path = 'FIO.fname'}}})
+---
+- error: Field [3]["FIO"] has type 'string' in one index, but type 'map' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[1].sname'}, {3, 'str', path = '["FIO"].fname'}}})
+---
+- error: Field 3 has type 'array' in one index, but type 'map' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO....fname'}}})
+---
+- error: 'Wrong index options (field 3): invalid path'
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO.fname', is_nullable = false}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+...
+idx ~= nil
+---
+- true
+...
+idx.parts[2].path == 'FIO.fname'
+---
+- true
+...
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'array'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+---
+...
+s:format(format)
+---
+- error: Field 3 has type 'array' in one index, but type 'map' in another
+...
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'map'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+---
+...
+s:format(format)
+---
+...
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: Field [3]["FIO"]["fname"] has type 'string' in one index, but type 'number'
+    in another
+...
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+---
+- error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected
+    map'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+---
+- error: 'Tuple field [3]["FIO"]["fname"] type does not match one required by operation:
+    expected string'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+---
+- error: Tuple field [3]["FIO"]["sname"] required by space format is missing
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+  4, 5]
+...
+idx:select()
+---
+- - [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+  - [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+    4, 5]
+...
+idx:min()
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:max()
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+  4, 5]
+...
+s:drop()
+---
+...
+s = box.schema.create_space('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[2]'}
+---
+...
+pk = s:create_index('pk', {parts = parts})
+---
+...
+s:insert{{1, 2}, 3}
+---
+- [[1, 2], 3]
+...
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+---
+...
+s:get(2)
+---
+- [[1, 2], 8]
+...
+s:drop()
+---
+...
+-- Create index on space with data
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk = s:create_index('primary', { type = 'tree' })
+---
+...
+s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
+---
+- [1, 7, {'town': 'London', 'FIO': 1234}, 4, 5]
+...
+s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [2, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+---
+- [4, 7, {'town': 'London', 'FIO': [1, 2, 3]}, 4, 5]
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected
+    map'
+...
+_ = s:delete(1)
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+_ = s:delete(2)
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+---
+- error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected
+    map'
+...
+_ = s:delete(4)
+---
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '["FIO"]["sname"]'}, {3, 'str', path = '["FIO"]["extra"]', is_nullable = true}}})
+---
+...
+idx ~= nil
+---
+- true
+...
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '["FIO"]["fname"]'}}})
+---
+- error: Field [3]["FIO"]["fname"] has type 'string' in one index, but type 'number'
+    in another
+...
+idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}}})
+---
+...
+idx2 ~= nil
+---
+- true
+...
+t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+---
+...
+-- Test field_map in tuple speed-up access by indexed path.
+t["[3][\"FIO\"][\"fname\"]"]
+---
+- Agent
+...
+idx:select()
+---
+- - [5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+  - [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:min()
+---
+- [5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+...
+idx:max()
+---
+- [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:drop()
+---
+...
+s:drop()
+---
+...
+-- Test complex JSON indexes
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'str', path='[3][2].a'}
+---
+...
+parts[2] = {1, 'unsigned', path = '[3][1]'}
+---
+...
+parts[3] = {2, 'str', path = '[2].d[1]'}
+---
+...
+pk = s:create_index('primary', { type = 'tree', parts =  parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+---
+- [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6,
+  [1, 2, 3]]
+...
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+---
+- error: Duplicate key exists in unique index 'primary' in space 'withdata'
+...
+parts = {}
+---
+...
+parts[1] = {4, 'unsigned', path='[1]', is_nullable = false}
+---
+...
+parts[2] = {4, 'unsigned', path='[2]', is_nullable = true}
+---
+...
+parts[3] = {4, 'unsigned', path='[4]', is_nullable = true}
+---
+...
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+---
+- error: Tuple field [4][1] required by space format is missing
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[3][2].b' }
+---
+...
+parts[2] = {3, 'unsigned'}
+---
+...
+crosspart_idx = s:create_index('crosspart', { parts =  parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[3][2].b'}
+---
+...
+num_idx = s:create_index('numeric', {parts =  parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:get(2)
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+num_idx:select()
+---
+- - [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+      9, 2, 3]]
+  - [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}],
+    6, [1, 2, 3]]
+  - [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+      0]]
+...
+num_idx:max()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:min()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+crosspart_idx:max() == num_idx:max()
+---
+- true
+...
+crosspart_idx:min() == num_idx:min()
+---
+- true
+...
+trap_idx:max()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+    2, 3]]
+...
+trap_idx:min()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+s:drop()
+---
+...
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk_simplified = s:create_index('primary', { type = 'tree',  parts = {{1, 'unsigned'}}})
+---
+...
+pk_simplified.path == box.NULL
+---
+- true
+...
+idx = s:create_index('idx', {parts = {{2, 'integer', path = 'a'}}})
+---
+...
+s:insert{31, {a = 1, aa = -1}}
+---
+- [31, {'a': 1, 'aa': -1}]
+...
+s:insert{22, {a = 2, aa = -2}}
+---
+- [22, {'a': 2, 'aa': -2}]
+...
+s:insert{13, {a = 3, aa = -3}}
+---
+- [13, {'a': 3, 'aa': -3}]
+...
+idx:select()
+---
+- - [31, {'a': 1, 'aa': -1}]
+  - [22, {'a': 2, 'aa': -2}]
+  - [13, {'a': 3, 'aa': -3}]
+...
+idx:alter({parts = {{2, 'integer', path = 'aa'}}})
+---
+...
+idx:select()
+---
+- - [13, {'a': 3, 'aa': -3}]
+  - [22, {'a': 2, 'aa': -2}]
+  - [31, {'a': 1, 'aa': -1}]
+...
+s:drop()
+---
+...
+-- incompatible format change
+s = box.schema.space.create('test')
+---
+...
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1]'}}})
+---
+...
+s:insert{{-1}}
+---
+- [[-1]]
+...
+i:alter{parts = {{1, 'string', path = '[1]'}}}
+---
+- error: 'Tuple field [1][1] type does not match one required by operation: expected
+    string'
+...
+s:insert{{'a'}}
+---
+- error: 'Tuple field [1][1] type does not match one required by operation: expected
+    integer'
+...
+i:drop()
+---
+...
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1].FIO'}}})
+---
+...
+s:insert{{{FIO=-1}}}
+---
+- [[{'FIO': -1}]]
+...
+i:alter{parts = {{1, 'integer', path = '[1][1]'}}}
+---
+- error: 'Tuple field [1][1] type does not match one required by operation: expected
+    array'
+...
+i:alter{parts = {{1, 'integer', path = '[1].FIO[1]'}}}
+---
+- error: 'Tuple field [1][1]["FIO"] type does not match one required by operation:
+    expected array'
+...
+s:drop()
+---
+...
+engine = nil
+---
+...
+test_run = nil
+---
+...
diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua
new file mode 100644
index 000000000..a876ff1ed
--- /dev/null
+++ b/test/engine/json.test.lua
@@ -0,0 +1,131 @@
+test_run = require('test_run').new()
+engine = test_run:get_cfg('engine')
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO["fname"]'}, {3, 'str', path = '["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '["FIO"]["fname"]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = 'FIO'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[1]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO'}, {3, 'str', path = 'FIO.fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[1].sname'}, {3, 'str', path = '["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO....fname'}}})
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO.fname', is_nullable = false}, {3, 'str', path = '["FIO"]["sname"]'}}})
+idx ~= nil
+idx.parts[2].path == 'FIO.fname'
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'array'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+s:format(format)
+format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'map'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
+s:format(format)
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+idx:select()
+idx:min()
+idx:max()
+s:drop()
+
+s = box.schema.create_space('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'unsigned', path='[2]'}
+pk = s:create_index('pk', {parts = parts})
+s:insert{{1, 2}, 3}
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+s:get(2)
+s:drop()
+
+-- Create index on space with data
+s = box.schema.space.create('withdata', {engine = engine})
+pk = s:create_index('primary', { type = 'tree' })
+s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
+s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+_ = s:delete(1)
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+_ = s:delete(2)
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+_ = s:delete(4)
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '["FIO"]["sname"]'}, {3, 'str', path = '["FIO"]["extra"]', is_nullable = true}}})
+idx ~= nil
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '["FIO"]["fname"]'}}})
+idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}}})
+idx2 ~= nil
+t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+-- Test field_map in tuple speed-up access by indexed path.
+t["[3][\"FIO\"][\"fname\"]"]
+idx:select()
+idx:min()
+idx:max()
+idx:drop()
+s:drop()
+
+-- Test complex JSON indexes
+s = box.schema.space.create('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'str', path='[3][2].a'}
+parts[2] = {1, 'unsigned', path = '[3][1]'}
+parts[3] = {2, 'str', path = '[2].d[1]'}
+pk = s:create_index('primary', { type = 'tree', parts =  parts})
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+parts = {}
+parts[1] = {4, 'unsigned', path='[1]', is_nullable = false}
+parts[2] = {4, 'unsigned', path='[2]', is_nullable = true}
+parts[3] = {4, 'unsigned', path='[4]', is_nullable = true}
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[3][2].b' }
+parts[2] = {3, 'unsigned'}
+crosspart_idx = s:create_index('crosspart', { parts =  parts})
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[3][2].b'}
+num_idx = s:create_index('numeric', {parts =  parts})
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+num_idx:get(2)
+num_idx:select()
+num_idx:max()
+num_idx:min()
+crosspart_idx:max() == num_idx:max()
+crosspart_idx:min() == num_idx:min()
+trap_idx:max()
+trap_idx:min()
+s:drop()
+
+s = box.schema.space.create('withdata', {engine = engine})
+pk_simplified = s:create_index('primary', { type = 'tree',  parts = {{1, 'unsigned'}}})
+pk_simplified.path == box.NULL
+idx = s:create_index('idx', {parts = {{2, 'integer', path = 'a'}}})
+s:insert{31, {a = 1, aa = -1}}
+s:insert{22, {a = 2, aa = -2}}
+s:insert{13, {a = 3, aa = -3}}
+idx:select()
+idx:alter({parts = {{2, 'integer', path = 'aa'}}})
+idx:select()
+s:drop()
+
+-- incompatible format change
+s = box.schema.space.create('test')
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1]'}}})
+s:insert{{-1}}
+i:alter{parts = {{1, 'string', path = '[1]'}}}
+s:insert{{'a'}}
+i:drop()
+i = s:create_index('pk', {parts = {{1, 'integer', path = '[1].FIO'}}})
+s:insert{{{FIO=-1}}}
+i:alter{parts = {{1, 'integer', path = '[1][1]'}}}
+i:alter{parts = {{1, 'integer', path = '[1].FIO[1]'}}}
+s:drop()
+
+engine = nil
+test_run = nil
+
-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 4/5] box: introduce offset_slot cache in key_part
  2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
                   ` (2 preceding siblings ...)
  2019-01-16 13:44 ` [PATCH v8 3/5] box: introduce JSON Indexes Kirill Shcherbatov
@ 2019-01-16 13:44 ` Kirill Shcherbatov
  2019-01-23 14:23   ` Vladimir Davydov
  2019-01-16 13:44 ` [PATCH v8 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
  2019-01-16 15:35 ` [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates Kirill Shcherbatov
  5 siblings, 1 reply; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 13:44 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: kostja, Kirill Shcherbatov

tuple_field_by_part looks up the tuple_field corresponding to the
given key part in tuple_format in order to quickly retrieve the offset
of indexed data from the tuple field map. For regular indexes this
operation is blazing fast, however of JSON indexes it is not as we
have to parse the path to data and then do multiple lookups in a JSON
tree. Since tuple_field_by_part is used by comparators, we should
strive to make this routine as fast as possible for all kinds of
indexes.

This patch introduces an optimization that is supposed to make
tuple_field_by_part for JSON indexes as fast as it is for regular
indexes in most cases. We do that by caching the offset slot right in
key_part. There's a catch here however - we create a new format
whenever an index is dropped or created and we don't reindex old
tuples. As a result, there may be several generations of tuples in the
same space, all using different formats while there's the only key_def
used for comparison.

To overcome this problem, we introduce the notion of tuple_format
epoch. This is a counter incremented each time a new format is
created. We store it in tuple_format and key_def, and we only use
the offset slot cached in a key_def if it's epoch coincides with the
epoch of the tuple format. If they don't, we look up a tuple_field as
before, and then update the cached value provided the epoch of the
tuple format.

Part of #1012
---
 src/box/key_def.c      | 15 ++++++++++-----
 src/box/key_def.h      | 14 ++++++++++++++
 src/box/tuple.h        |  2 +-
 src/box/tuple_format.c |  4 +++-
 src/box/tuple_format.h | 36 +++++++++++++++++++++++++++++++++---
 5 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/src/box/key_def.c b/src/box/key_def.c
index 0ed497dfc..b5e93f117 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -139,7 +139,8 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 		 enum field_type type, enum on_conflict_action nullable_action,
 		 struct coll *coll, uint32_t coll_id,
 		 enum sort_order sort_order, const char *path,
-		 uint32_t path_len, char **path_pool)
+		 uint32_t path_len, char **path_pool, int32_t offset_slot_cache,
+		 uint64_t format_epoch)
 {
 	assert(part_no < def->part_count);
 	assert(type < field_type_MAX);
@@ -151,6 +152,8 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
 	def->parts[part_no].coll = coll;
 	def->parts[part_no].coll_id = coll_id;
 	def->parts[part_no].sort_order = sort_order;
+	def->parts[part_no].offset_slot_cache = offset_slot_cache;
+	def->parts[part_no].format_epoch = format_epoch;
 	if (path != NULL) {
 		assert(path_pool != NULL);
 		def->parts[part_no].path = *path_pool;
@@ -199,7 +202,7 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count)
 		key_def_set_part(def, i, part->fieldno, part->type,
 				 part->nullable_action, coll, part->coll_id,
 				 part->sort_order, part->path, path_len,
-				 &path_pool);
+				 &path_pool, TUPLE_OFFSET_SLOT_NIL, 0);
 	}
 	key_def_set_cmp(def);
 	return def;
@@ -252,7 +255,7 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
 				 (enum field_type)types[item],
 				 ON_CONFLICT_ACTION_DEFAULT,
 				 NULL, COLL_NONE, SORT_ORDER_ASC, NULL, 0,
-				 NULL);
+				 NULL, TUPLE_OFFSET_SLOT_NIL, 0);
 	}
 	key_def_set_cmp(key_def);
 	return key_def;
@@ -662,7 +665,8 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 		key_def_set_part(new_def, pos++, part->fieldno, part->type,
 				 part->nullable_action, part->coll,
 				 part->coll_id, part->sort_order, part->path,
-				 part->path_len, &path_pool);
+				 part->path_len, &path_pool,
+				 part->offset_slot_cache, part->format_epoch);
 	}
 
 	/* Set-append second key def's part to the new key def. */
@@ -674,7 +678,8 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
 		key_def_set_part(new_def, pos++, part->fieldno, part->type,
 				 part->nullable_action, part->coll,
 				 part->coll_id, part->sort_order, part->path,
-				 part->path_len, &path_pool);
+				 part->path_len, &path_pool,
+				 part->offset_slot_cache, part->format_epoch);
 	}
 	key_def_set_cmp(new_def);
 	return new_def;
diff --git a/src/box/key_def.h b/src/box/key_def.h
index fe4acffb5..7a71ed060 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -97,6 +97,20 @@ struct key_part {
 	char *path;
 	/** The length of JSON path. */
 	uint32_t path_len;
+	/**
+	 * Epoch of the tuple format the offset slot cached in
+	 * this part is valid for, see tuple_format::epoch.
+	 */
+	uint64_t format_epoch;
+	/**
+	 * Cached value of the offset slot corresponding to
+	 * the indexed field (tuple_field::offset_slot).
+	 * Valid only if key_def::epoch equals the epoch of
+	 * the tuple format. This value is updated in
+	 * tuple_field_by_part_raw to always store the
+	 * offset corresponding to the last used tuple format.
+	 */
+	int32_t offset_slot_cache;
 };
 
 struct key_def;
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 4368dac4e..f8a4e6f22 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -527,7 +527,7 @@ tuple_field_by_path(const struct tuple *tuple, uint32_t fieldno,
 {
 	return tuple_field_raw_by_path(tuple_format(tuple), tuple_data(tuple),
 				       tuple_field_map(tuple), fieldno,
-				       path, path_len);
+				       path, path_len, NULL);
 }
 
 /**
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 9af9c307a..3cec1cc44 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -39,6 +39,7 @@ struct tuple_format **tuple_formats;
 static intptr_t recycled_format_ids = FORMAT_ID_NIL;
 
 static uint32_t formats_size = 0, formats_capacity = 0;
+static uint64_t formats_epoch = 0;
 
 static struct tuple_field *
 tuple_field_new(void)
@@ -540,6 +541,7 @@ tuple_format_new(struct tuple_format_vtab *vtab, struct key_def * const *keys,
 	format->vtab = *vtab;
 	format->engine = NULL;
 	format->is_temporary = false;
+	format->epoch = ++formats_epoch;
 	if (tuple_format_register(format) < 0) {
 		tuple_format_destroy(format);
 		free(format);
@@ -968,5 +970,5 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
 	}
 	return tuple_field_raw_by_path(format, tuple, field_map, fieldno,
 				       path + lexer.offset,
-				       path_len - lexer.offset);
+				       path_len - lexer.offset, NULL);
 }
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 311bacbe8..fce0b9c93 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -137,6 +137,12 @@ tuple_field_is_nullable(struct tuple_field *tuple_field)
  * Tuple format describes how tuple is stored and information about its fields
  */
 struct tuple_format {
+	/**
+	 * Counter that grows incrementally on space rebuild
+	 * used for caching offset slot in key_part, for more
+	 * details see key_part::offset_slot_cache.
+	 */
+	uint64_t epoch;
 	/** Virtual function table */
 	struct tuple_format_vtab vtab;
 	/** Pointer to engine-specific data. */
@@ -436,12 +442,17 @@ tuple_field_go_to_path(const char **data, const char *path, uint32_t path_len);
 /**
  * Get a field at the specific position in this MessagePack
  * array by fieldno and path.
+ * The offset_slot[out] may be specified to save it on use,
+ * set NULL otherwise.
  */
 static inline const char *
 tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 			const uint32_t *field_map, uint32_t fieldno,
-			const char *path, uint32_t path_len)
+			const char *path, uint32_t path_len,
+			int32_t *offset_slot)
 {
+	if (offset_slot != NULL)
+		*offset_slot = TUPLE_OFFSET_SLOT_NIL;
 	if (likely(path != NULL && fieldno < format->index_field_count)) {
 		/* Indexed field */
 		struct tuple_field *field =
@@ -450,6 +461,8 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 		if (field == NULL)
 			goto parse;
 		assert(field != NULL);
+		if (offset_slot != NULL)
+			*offset_slot = field->offset_slot;
 		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
 			assert(-field->offset_slot * sizeof(uint32_t) <=
 			       format->field_map_size);
@@ -496,8 +509,25 @@ static inline const char *
 tuple_field_by_part_raw(struct tuple_format *format, const char *data,
 			const uint32_t *field_map, struct key_part *part)
 {
-	return tuple_field_raw_by_path(format, data, field_map, part->fieldno,
-				       part->path, part->path_len);
+	if (likely(part->format_epoch == format->epoch)) {
+		int32_t offset_slot = part->offset_slot_cache;
+		assert(-offset_slot * sizeof(uint32_t) <=
+		       format->field_map_size);
+		return field_map[offset_slot] != 0 ?
+		       data + field_map[offset_slot] : NULL;
+	} else {
+		assert(format->epoch != 0);
+		int32_t offset_slot;
+		const char *field =
+			tuple_field_raw_by_path(format, data, field_map,
+						part->fieldno, part->path,
+						part->path_len, &offset_slot);
+		if (offset_slot != TUPLE_OFFSET_SLOT_NIL) {
+			part->format_epoch = format->epoch;
+			part->offset_slot_cache = offset_slot;
+		}
+		return field;
+	}
 }
 
 #if defined(__cplusplus)
-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 5/5] box: specify indexes in user-friendly form
  2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
                   ` (3 preceding siblings ...)
  2019-01-16 13:44 ` [PATCH v8 4/5] box: introduce offset_slot cache in key_part Kirill Shcherbatov
@ 2019-01-16 13:44 ` Kirill Shcherbatov
  2019-01-23 15:29   ` Vladimir Davydov
  2019-01-16 15:35 ` [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates Kirill Shcherbatov
  5 siblings, 1 reply; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 13:44 UTC (permalink / raw)
  To: tarantool-patches, vdavydov.dev; +Cc: kostja, Kirill Shcherbatov

Implemented a more convenient interface for creating an index
by JSON path. Instead of specifying fieldno and relative path
it comes possible to pass full JSON path to data.

Closes #1012

@TarantoolBot document
Title: Indexes by JSON path
Sometimes field data could have complex document structure.
When this structure is consistent across whole space,
you are able to create an index by JSON path.

Example:
s = box.schema.space.create('sample')
format = {{'id', 'unsigned'}, {'data', 'map'}}
s:format(format)
-- explicit JSON index creation
age_idx = s:create_index('age', {{2, 'number', path = "age"}})
-- user-friendly syntax for JSON index creation
parts = {{'data.FIO["fname"]', 'str'}, {'data.FIO["sname"]', 'str'},
         {'data.age', 'number'}}
info_idx = s:create_index('info', {parts = parts}})
s:insert({1, {FIO={fname="James", sname="Bond"}, age=35}})
---
 src/box/lua/schema.lua    | 60 ++++++++++++++++++++++++++++++++++-----
 test/engine/json.result   | 42 +++++++++++++++++++++++++++
 test/engine/json.test.lua | 12 ++++++++
 3 files changed, 107 insertions(+), 7 deletions(-)

diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
index 8a804f0ba..a55ef5b22 100644
--- a/src/box/lua/schema.lua
+++ b/src/box/lua/schema.lua
@@ -575,6 +575,49 @@ local function update_index_parts_1_6_0(parts)
     return result
 end
 
+local function format_field_index_by_name(format, name)
+    for k,v in pairs(format) do
+        if v.name == name then
+            return k
+        end
+    end
+    return nil
+end
+
+local function format_field_resolve(format, name)
+    local idx = nil
+    local field_name = nil
+    -- try resolve whole name
+    idx = format_field_index_by_name(format, name)
+    if idx ~= nil then
+        print("Case 1 "..idx)
+        return idx, nil
+    end
+    -- try resolve [%d]
+    field_name = string.match(name, "^%[(%d+)%]")
+    idx = tonumber(field_name)
+    if idx ~= nil then
+        print("Case 2 "..idx..string.sub(name, string.len(field_name) + 3))
+        return idx, string.sub(name, string.len(field_name) + 3)
+    end
+    -- try resolve ["%s"] or ['%s']
+    field_name = string.match(name, "^%[\"(.*)\"%]") or
+                 string.match(name, "^%[\'(.*)\'%]")
+    idx = format_field_index_by_name(format, field_name)
+    if idx ~= nil then
+        print("Case 3 "..idx..string.sub(name, string.len(field_name) + 5))
+        return idx, string.sub(name, string.len(field_name) + 5)
+    end
+    -- try to resolve .*[ and .*.
+    field_name = string.match(name, "^([^.%[]-)[.%[]")
+    idx = format_field_index_by_name(format, field_name)
+    if idx ~= nil then
+        print("Case 4 "..idx..string.sub(name, string.len(field_name) + 1))
+        return idx, string.sub(name, string.len(field_name) + 1)
+    end
+    return nil, nil
+end
+
 local function update_index_parts(format, parts)
     if type(parts) ~= "table" then
         box.error(box.error.ILLEGAL_PARAMS,
@@ -626,16 +669,19 @@ local function update_index_parts(format, parts)
             box.error(box.error.ILLEGAL_PARAMS,
                       "options.parts[" .. i .. "]: field (name or number) is expected")
         elseif type(part.field) == 'string' then
-            for k,v in pairs(format) do
-                if v.name == part.field then
-                    part.field = k
-                    break
-                end
+            local idx, path = format_field_resolve(format, part.field)
+            if idx == nil then
+               box.error(box.error.ILLEGAL_PARAMS,
+                         "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")
             end
-            if type(part.field) == 'string' then
+            if part.path ~= nil and part.path ~= path then
                 box.error(box.error.ILLEGAL_PARAMS,
-                          "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")
+                          "options.parts[" .. i .. "]: field path '"..
+                          path.." doesn't math the path '" ..part.path.."'")
             end
+            parts_can_be_simplified = parts_can_be_simplified and path == nil
+            part.field = idx
+            part.path = path or part.path
         elseif part.field == 0 then
             box.error(box.error.ILLEGAL_PARAMS,
                       "options.parts[" .. i .. "]: field (number) must be one-based")
diff --git a/test/engine/json.result b/test/engine/json.result
index 9e5accb0b..32f787a3d 100644
--- a/test/engine/json.result
+++ b/test/engine/json.result
@@ -70,6 +70,48 @@ s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname
 - error: Field [3]["FIO"]["fname"] has type 'string' in one index, but type 'number'
     in another
 ...
+s:create_index('test3', {parts = {{2, 'number'}, {']sad.FIO["fname"]', 'str'}}})
+---
+- error: 'Illegal parameters, options.parts[2]: field was not found by name '']sad.FIO["fname"]'''
+...
+s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str', path = "FIO[\"sname\"]"}}})
+---
+- error: 'Illegal parameters, options.parts[2]: field path ''.FIO["fname"] doesn''t
+    math the path ''FIO["sname"]'''
+...
+idx3 = s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str'}}})
+---
+...
+idx3 ~= nil
+---
+- true
+...
+idx3.parts[2].path == ".FIO[\"fname\"]"
+---
+- true
+...
+s:create_index('test4', {parts = {{2, 'number'}, {'invalid.FIO["fname"]', 'str'}}})
+---
+- error: 'Illegal parameters, options.parts[2]: field was not found by name ''invalid.FIO["fname"]'''
+...
+idx4 = s:create_index('test4', {parts = {{2, 'number'}, {'data.FIO["fname"]', 'str'}}})
+---
+...
+idx4 ~= nil
+---
+- true
+...
+idx4.parts[2].path == ".FIO[\"fname\"]"
+---
+- true
+...
+-- Vinyl has optimizations that omit index checks, so errors could differ.
+idx3:drop()
+---
+...
+idx4:drop()
+---
+...
 s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
 ---
 - error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected
diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua
index a876ff1ed..cfed275b0 100644
--- a/test/engine/json.test.lua
+++ b/test/engine/json.test.lua
@@ -19,6 +19,18 @@ s:format(format)
 format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'map'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
 s:format(format)
 s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
+s:create_index('test3', {parts = {{2, 'number'}, {']sad.FIO["fname"]', 'str'}}})
+s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str', path = "FIO[\"sname\"]"}}})
+idx3 = s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str'}}})
+idx3 ~= nil
+idx3.parts[2].path == ".FIO[\"fname\"]"
+s:create_index('test4', {parts = {{2, 'number'}, {'invalid.FIO["fname"]', 'str'}}})
+idx4 = s:create_index('test4', {parts = {{2, 'number'}, {'data.FIO["fname"]', 'str'}}})
+idx4 ~= nil
+idx4.parts[2].path == ".FIO[\"fname\"]"
+-- Vinyl has optimizations that omit index checks, so errors could differ.
+idx3:drop()
+idx4:drop()
 s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
 s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
 s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates
  2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
                   ` (4 preceding siblings ...)
  2019-01-16 13:44 ` [PATCH v8 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
@ 2019-01-16 15:35 ` Kirill Shcherbatov
  2019-01-23 14:15   ` Vladimir Davydov
  5 siblings, 1 reply; 13+ messages in thread
From: Kirill Shcherbatov @ 2019-01-16 15:35 UTC (permalink / raw)
  To: tarantool-patches, Vladimir Davydov

Introduced has_json_path flag for compare, hash and extract
functions(that are really hot) to make possible do not look to
path field for flat indexes without any JSON paths.

Part of #1012
---
 src/box/tuple_compare.cc     | 112 +++++++++++++++++++++++++----------
 src/box/tuple_extract_key.cc | 104 ++++++++++++++++++++------------
 src/box/tuple_hash.cc        |  45 ++++++++++----
 3 files changed, 182 insertions(+), 79 deletions(-)

diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index 7ab6e3bf6..dff3aebbd 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -458,11 +458,12 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	return i;
 }
 
-template<bool is_nullable, bool has_optional_parts>
+template<bool is_nullable, bool has_optional_parts, bool has_json_path>
 static inline int
 tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		       struct key_def *key_def)
 {
+	assert(has_json_path == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
@@ -508,10 +509,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 		end = part + key_def->part_count;
 
 	for (; part < end; part++) {
-		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
-						  field_map_a, part);
-		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
-						  field_map_b, part);
+		if (!has_json_path) {
+			field_a = tuple_field_raw(format_a, tuple_a_raw,
+						  field_map_a,
+						  part->fieldno);
+			field_b = tuple_field_raw(format_b, tuple_b_raw,
+						  field_map_b,
+						  part->fieldno);
+		} else {
+			field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
+							  field_map_a, part);
+			field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
+							  field_map_b, part);
+		}
 		assert(has_optional_parts ||
 		       (field_a != NULL && field_b != NULL));
 		if (! is_nullable) {
@@ -558,10 +568,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	 */
 	end = key_def->parts + key_def->part_count;
 	for (; part < end; ++part) {
-		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
-						  field_map_a, part);
-		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
-						  field_map_b, part);
+		if (!has_json_path) {
+			field_a = tuple_field_raw(format_a, tuple_a_raw,
+						  field_map_a,
+						  part->fieldno);
+			field_b = tuple_field_raw(format_b, tuple_b_raw,
+						  field_map_b,
+						  part->fieldno);
+		} else {
+			field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
+							  field_map_a, part);
+			field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
+							  field_map_b, part);
+		}
 		/*
 		 * Extended parts are primary, and they can not
 		 * be absent or be NULLs.
@@ -575,11 +594,12 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
 	return 0;
 }
 
-template<bool is_nullable, bool has_optional_parts>
+template<bool is_nullable, bool has_optional_parts, bool has_json_paths>
 static inline int
 tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 				uint32_t part_count, struct key_def *key_def)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || is_nullable);
 	assert(is_nullable == key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
@@ -591,9 +611,14 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	const uint32_t *field_map = tuple_field_map(tuple);
 	enum mp_type a_type, b_type;
 	if (likely(part_count == 1)) {
-		const char *field =
-			tuple_field_by_part_raw(format, tuple_raw, field_map,
-						part);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, tuple_raw, field_map,
+						part->fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, tuple_raw,
+							field_map, part);
+		}
 		if (! is_nullable) {
 			return tuple_compare_field(field, key, part->type,
 						   part->coll);
@@ -617,9 +642,14 @@ tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
 	struct key_part *end = part + part_count;
 	int rc;
 	for (; part < end; ++part, mp_next(&key)) {
-		const char *field =
-			tuple_field_by_part_raw(format, tuple_raw,
-						field_map, part);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, tuple_raw, field_map,
+						part->fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, tuple_raw,
+							field_map, part);
+		}
 		if (! is_nullable) {
 			rc = tuple_compare_field(field, key, part->type,
 						 part->coll);
@@ -1012,19 +1042,31 @@ static const comparator_signature cmp_arr[] = {
 
 #undef COMPARATOR
 
+static const tuple_compare_t compare_slowpath_funcs[] = {
+	tuple_compare_slowpath<false, false, false>,
+	tuple_compare_slowpath<true, false, false>,
+	tuple_compare_slowpath<false, true, false>,
+	tuple_compare_slowpath<true, true, false>,
+	tuple_compare_slowpath<false, false, true>,
+	tuple_compare_slowpath<true, false, true>,
+	tuple_compare_slowpath<false, true, true>,
+	tuple_compare_slowpath<true, true, true>
+};
+
 tuple_compare_t
 tuple_compare_create(const struct key_def *def)
 {
+	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
+			   2 * (def->has_optional_parts ? 1 : 0) +
+			   4 * (def->has_json_paths ? 1 : 0);
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts)
 				return tuple_compare_sequential<true, true>;
 			else
 				return tuple_compare_sequential<true, false>;
-		} else if (def->has_optional_parts) {
-			return tuple_compare_slowpath<true, true>;
 		} else {
-			return tuple_compare_slowpath<true, false>;
+			return compare_slowpath_funcs[cmp_func_idx];
 		}
 	}
 	assert(! def->has_optional_parts);
@@ -1044,10 +1086,9 @@ tuple_compare_create(const struct key_def *def)
 				return cmp_arr[k].f;
 		}
 	}
-	if (key_def_is_sequential(def))
-		return tuple_compare_sequential<false, false>;
-	else
-		return tuple_compare_slowpath<false, false>;
+	return key_def_is_sequential(def) ?
+	       tuple_compare_sequential<false, false> :
+	       compare_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare */
@@ -1229,9 +1270,23 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
 
 #undef KEY_COMPARATOR
 
+static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
+	tuple_compare_with_key_slowpath<false, false, false>,
+	tuple_compare_with_key_slowpath<true, false, false>,
+	tuple_compare_with_key_slowpath<false, true, false>,
+	tuple_compare_with_key_slowpath<true, true, false>,
+	tuple_compare_with_key_slowpath<false, false, true>,
+	tuple_compare_with_key_slowpath<true, false, true>,
+	tuple_compare_with_key_slowpath<false, true, true>,
+	tuple_compare_with_key_slowpath<true, true, true>
+};
+
 tuple_compare_with_key_t
 tuple_compare_with_key_create(const struct key_def *def)
 {
+	int cmp_func_idx = (def->is_nullable ? 1 : 0) +
+			   2 * (def->has_optional_parts ? 1 : 0) +
+			   4 * (def->has_json_paths ? 1 : 0);
 	if (def->is_nullable) {
 		if (key_def_is_sequential(def)) {
 			if (def->has_optional_parts) {
@@ -1241,10 +1296,8 @@ tuple_compare_with_key_create(const struct key_def *def)
 				return tuple_compare_with_key_sequential<true,
 									 false>;
 			}
-		} else if (def->has_optional_parts) {
-			return tuple_compare_with_key_slowpath<true, true>;
 		} else {
-			return tuple_compare_with_key_slowpath<true, false>;
+			return compare_with_key_slowpath_funcs[cmp_func_idx];
 		}
 	}
 	assert(! def->has_optional_parts);
@@ -1267,10 +1320,9 @@ tuple_compare_with_key_create(const struct key_def *def)
 				return cmp_wk_arr[k].f;
 		}
 	}
-	if (key_def_is_sequential(def))
-		return tuple_compare_with_key_sequential<false, false>;
-	else
-		return tuple_compare_with_key_slowpath<false, false>;
+	return key_def_is_sequential(def) ?
+	       tuple_compare_with_key_sequential<false, false> :
+	       compare_with_key_slowpath_funcs[cmp_func_idx];
 }
 
 /* }}} tuple_compare_with_key */
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index c10ee8ce4..78462908d 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -5,13 +5,18 @@
 enum { MSGPACK_NULL = 0xc0 };
 
 /** True if key part i and i+1 are sequential. */
+template <bool has_json_paths>
 static inline bool
 key_def_parts_are_sequential(const struct key_def *def, int i)
 {
 	const struct key_part *part1 = &def->parts[i];
 	const struct key_part *part2 = &def->parts[i + 1];
-	return part1->fieldno + 1 == part2->fieldno &&
-	       part1->path == NULL && part2->path == NULL;
+	if (!has_json_paths) {
+		return part1->fieldno + 1 == part2->fieldno;
+	} else {
+		return part1->fieldno + 1 == part2->fieldno &&
+		       part1->path == NULL && part2->path == NULL;
+	}
 }
 
 /** True, if a key con contain two or more parts in sequence. */
@@ -19,7 +24,7 @@ static bool
 key_def_contains_sequential_parts(const struct key_def *def)
 {
 	for (uint32_t i = 0; i < def->part_count - 1; ++i) {
-		if (key_def_parts_are_sequential(def, i))
+		if (key_def_parts_are_sequential<true>(def, i))
 			return true;
 	}
 	return false;
@@ -99,11 +104,13 @@ tuple_extract_key_sequential(const struct tuple *tuple, struct key_def *key_def,
  * General-purpose implementation of tuple_extract_key()
  * @copydoc tuple_extract_key()
  */
-template <bool contains_sequential_parts, bool has_optional_parts>
+template <bool contains_sequential_parts, bool has_optional_parts,
+	  bool has_json_paths>
 static char *
 tuple_extract_key_slowpath(const struct tuple *tuple,
 			   struct key_def *key_def, uint32_t *key_size)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(contains_sequential_parts ==
@@ -118,9 +125,14 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 
 	/* Calculate the key size. */
 	for (uint32_t i = 0; i < part_count; ++i) {
-		const char *field =
-			tuple_field_by_part_raw(format, data, field_map,
-						&key_def->parts[i]);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, data, field_map,
+						key_def->parts[i].fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, data, field_map,
+							&key_def->parts[i]);
+		}
 		if (has_optional_parts && field == NULL) {
 			bsize += mp_sizeof_nil();
 			continue;
@@ -133,7 +145,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 			 * minimize tuple_field_raw() calls.
 			 */
 			for (; i < part_count - 1; i++) {
-				if (!key_def_parts_are_sequential(key_def, i)) {
+				if (!key_def_parts_are_sequential
+				        <has_json_paths>(key_def, i)) {
 					/*
 					 * End of sequential part.
 					 */
@@ -159,9 +172,14 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 	}
 	char *key_buf = mp_encode_array(key, part_count);
 	for (uint32_t i = 0; i < part_count; ++i) {
-		const char *field =
-			tuple_field_by_part_raw(format, data, field_map,
-						&key_def->parts[i]);
+		const char *field;
+		if (!has_json_paths) {
+			field = tuple_field_raw(format, data, field_map,
+						key_def->parts[i].fieldno);
+		} else {
+			field = tuple_field_by_part_raw(format, data, field_map,
+							&key_def->parts[i]);
+		}
 		if (has_optional_parts && field == NULL) {
 			key_buf = mp_encode_nil(key_buf);
 			continue;
@@ -174,7 +192,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
 			 * minimize tuple_field_raw() calls.
 			 */
 			for (; i < part_count - 1; i++) {
-				if (!key_def_parts_are_sequential(key_def, i)) {
+				if (!key_def_parts_are_sequential
+				        <has_json_paths>(key_def, i)) {
 					/*
 					 * End of sequential part.
 					 */
@@ -207,11 +226,12 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
  * General-purpose version of tuple_extract_key_raw()
  * @copydoc tuple_extract_key_raw()
  */
-template <bool has_optional_parts>
+template <bool has_optional_parts, bool has_json_paths>
 static char *
 tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 			       struct key_def *key_def, uint32_t *key_size)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(!has_optional_parts || key_def->is_nullable);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	assert(mp_sizeof_nil() == 1);
@@ -239,7 +259,8 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 		uint32_t fieldno = key_def->parts[i].fieldno;
 		uint32_t null_count = 0;
 		for (; i < key_def->part_count - 1; i++) {
-			if (!key_def_parts_are_sequential(key_def, i))
+			if (!key_def_parts_are_sequential
+			        <has_json_paths>(key_def, i))
 				break;
 		}
 		const struct key_part *part = &key_def->parts[i];
@@ -314,6 +335,17 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
 	return key;
 }
 
+static const tuple_extract_key_t extract_key_slowpath_funcs[] = {
+	tuple_extract_key_slowpath<false, false, false>,
+	tuple_extract_key_slowpath<true, false, false>,
+	tuple_extract_key_slowpath<false, true, false>,
+	tuple_extract_key_slowpath<true, true, false>,
+	tuple_extract_key_slowpath<false, false, true>,
+	tuple_extract_key_slowpath<true, false, true>,
+	tuple_extract_key_slowpath<false, true, true>,
+	tuple_extract_key_slowpath<true, true, true>
+};
+
 /**
  * Initialize tuple_extract_key() and tuple_extract_key_raw()
  */
@@ -334,32 +366,30 @@ tuple_extract_key_set(struct key_def *key_def)
 				tuple_extract_key_sequential_raw<false>;
 		}
 	} else {
-		if (key_def->has_optional_parts) {
-			assert(key_def->is_nullable);
-			if (key_def_contains_sequential_parts(key_def)) {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<true, true>;
-			} else {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<false, true>;
-			}
-		} else {
-			if (key_def_contains_sequential_parts(key_def)) {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<true, false>;
-			} else {
-				key_def->tuple_extract_key =
-					tuple_extract_key_slowpath<false,
-								   false>;
-			}
-		}
+		int func_idx =
+			(key_def_contains_sequential_parts(key_def) ? 1 : 0) +
+			2 * (key_def->has_optional_parts ? 1 : 0) +
+			4 * (key_def->has_json_paths ? 1 : 0);
+		key_def->tuple_extract_key =
+			extract_key_slowpath_funcs[func_idx];
+		assert(!key_def->has_optional_parts || key_def->is_nullable);
 	}
 	if (key_def->has_optional_parts) {
 		assert(key_def->is_nullable);
-		key_def->tuple_extract_key_raw =
-			tuple_extract_key_slowpath_raw<true>;
+		if (key_def->has_json_paths) {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<true, true>;
+		} else {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<true, false>;
+		}
 	} else {
-		key_def->tuple_extract_key_raw =
-			tuple_extract_key_slowpath_raw<false>;
+		if (key_def->has_json_paths) {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<false, true>;
+		} else {
+			key_def->tuple_extract_key_raw =
+				tuple_extract_key_slowpath_raw<false, false>;
+		}
 	}
 }
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index 3486ce11c..8ede29045 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -213,7 +213,7 @@ static const hasher_signature hash_arr[] = {
 
 #undef HASHER
 
-template <bool has_optional_parts>
+template <bool has_optional_parts, bool has_json_paths>
 uint32_t
 tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def);
 
@@ -256,10 +256,17 @@ tuple_hash_func_set(struct key_def *key_def) {
 	}
 
 slowpath:
-	if (key_def->has_optional_parts)
-		key_def->tuple_hash = tuple_hash_slowpath<true>;
-	else
-		key_def->tuple_hash = tuple_hash_slowpath<false>;
+	if (key_def->has_optional_parts) {
+		if (key_def->has_json_paths)
+			key_def->tuple_hash = tuple_hash_slowpath<true, true>;
+		else
+			key_def->tuple_hash = tuple_hash_slowpath<true, false>;
+	} else {
+		if (key_def->has_json_paths)
+			key_def->tuple_hash = tuple_hash_slowpath<false, true>;
+		else
+			key_def->tuple_hash = tuple_hash_slowpath<false, false>;
+	}
 	key_def->key_hash = key_hash_slowpath;
 }
 
@@ -319,10 +326,11 @@ tuple_hash_key_part(uint32_t *ph1, uint32_t *pcarry, const struct tuple *tuple,
 	return tuple_hash_field(ph1, pcarry, &field, part->coll);
 }
 
-template <bool has_optional_parts>
+template <bool has_optional_parts, bool has_json_paths>
 uint32_t
 tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 {
+	assert(has_json_paths == key_def->has_json_paths);
 	assert(has_optional_parts == key_def->has_optional_parts);
 	uint32_t h = HASH_SEED;
 	uint32_t carry = 0;
@@ -331,9 +339,13 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 	struct tuple_format *format = tuple_format(tuple);
 	const char *tuple_raw = tuple_data(tuple);
 	const uint32_t *field_map = tuple_field_map(tuple);
-	const char *field =
-		tuple_field_by_part_raw(format, tuple_raw, field_map,
-					key_def->parts);
+	const char *field;
+	if (!has_json_paths) {
+		field = tuple_field(tuple, prev_fieldno);
+	} else {
+		field = tuple_field_by_part_raw(format, tuple_raw, field_map,
+						key_def->parts);
+	}
 	const char *end = (char *)tuple + tuple_size(tuple);
 	if (has_optional_parts && field == NULL) {
 		total_size += tuple_hash_null(&h, &carry);
@@ -347,9 +359,18 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
 		 * need of tuple_field
 		 */
 		if (prev_fieldno + 1 != key_def->parts[part_id].fieldno) {
-			struct key_part *part = &key_def->parts[part_id];
-			field = tuple_field_by_part_raw(format, tuple_raw,
-							field_map, part);
+			if (!has_json_paths) {
+				field = tuple_field(tuple,
+						    key_def->parts[part_id].
+						    fieldno);
+			} else {
+				struct key_part *part =
+					&key_def->parts[part_id];
+				field = tuple_field_by_part_raw(format,
+								tuple_raw,
+								field_map,
+								part);
+			}
 		}
 		if (has_optional_parts && (field == NULL || field >= end)) {
 			total_size += tuple_hash_null(&h, &carry);
-- 
2.19.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 1/5] lib: updated msgpuck library version
  2019-01-16 13:44 ` [PATCH v8 1/5] lib: updated msgpuck library version Kirill Shcherbatov
@ 2019-01-23 12:53   ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2019-01-23 12:53 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches, kostja

Subject line should be in imperative mode:

	Update msgpuck library

In the comment it's worth mentioning why you need it.

On Wed, Jan 16, 2019 at 04:44:39PM +0300, Kirill Shcherbatov wrote:
> Needed for #1012
> ---
>  src/lib/msgpuck          |  2 +-
>  test/unit/msgpack.result | 24 +++++++++++++++++++++++-
>  2 files changed, 24 insertions(+), 2 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine
  2019-01-16 13:44 ` [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine Kirill Shcherbatov
@ 2019-01-23 13:15   ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2019-01-23 13:15 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches, kostja

On Wed, Jan 16, 2019 at 04:44:40PM +0300, Kirill Shcherbatov wrote:
> diff --git a/src/box/tuple.h b/src/box/tuple.h
> index 83e5b7013..4368dac4e 100644
> --- a/src/box/tuple.h
> +++ b/src/box/tuple.h
> @@ -513,6 +513,23 @@ tuple_field(const struct tuple *tuple, uint32_t fieldno)
>  			       tuple_field_map(tuple), fieldno);
>  }
>  
> +/**
> + * Get a field refereed by index @part in tuple.
> + * @param tuple Tuple to get the field from.
> + * @param part Index part to use.

There's no 'part' in this function. Please fix the comment.

> + * @param path Relative JSON path to field data.
> + * @param path_len The length of the @path.
> + * @retval Field data if the field exists or NULL.
> + */
> +static inline const char *
> +tuple_field_by_path(const struct tuple *tuple, uint32_t fieldno,
> +		    const char *path, uint32_t path_len)
> +{
> +	return tuple_field_raw_by_path(tuple_format(tuple), tuple_data(tuple),
> +				       tuple_field_map(tuple), fieldno,
> +				       path, path_len);
> +}
> +
>  /**
>   * Get a field refereed by index @part in tuple.
>   * @param tuple Tuple to get the field from.
> @@ -532,38 +549,16 @@ tuple_field_by_part(const struct tuple *tuple, struct key_part *part)
>   * @param path Field JSON path.
>   * @param path_len Length of @a path.
>   * @param path_hash Hash of @a path.
> - * @param[out] field Found field, or NULL, if not found.
> - *
> - * @retval  0 Success.
> - * @retval -1 Error in JSON path.
> - */
> -static inline int
> -tuple_field_by_path(const struct tuple *tuple, const char *path,
> -                    uint32_t path_len, uint32_t path_hash,
> -                    const char **field)
> -{
> -	return tuple_field_raw_by_path(tuple_format(tuple), tuple_data(tuple),
> -	                               tuple_field_map(tuple), path, path_len,
> -	                               path_hash, field);
> -}
> -
> -/**
> - * Get tuple field by its name.
> - * @param tuple Tuple to get field from.
> - * @param name Field name.
> - * @param name_len Length of @a name.
> - * @param name_hash Hash of @a name.
>   *
> - * @retval not NULL MessagePack field.
> - * @retval     NULL No field with @a name.
> + * @retval field data if field exists or NULL

Missing dot (.)

Please update the comment so that it's clear what's the difference
between tuple_field_by_full_path and tuple_field_path. Don't forget
about tuple_field_raw_by_full_path.

>   */
>  static inline const char *
> -tuple_field_by_name(const struct tuple *tuple, const char *name,
> -		    uint32_t name_len, uint32_t name_hash)
> +tuple_field_by_full_path(const struct tuple *tuple, const char *path,
> +			 uint32_t path_len, uint32_t path_hash)
>  {
> -	return tuple_field_raw_by_name(tuple_format(tuple), tuple_data(tuple),
> -				       tuple_field_map(tuple), name, name_len,
> -				       name_hash);
> +	return tuple_field_raw_by_full_path(tuple_format(tuple), tuple_data(tuple),
> +					    tuple_field_map(tuple),
> +					    path, path_len, path_hash);
>  }
>  
>  /**
> diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
> index 30b93b610..d32bb91f2 100644
> --- a/src/box/tuple_format.h
> +++ b/src/box/tuple_format.h
> @@ -397,27 +397,33 @@ tuple_field_raw(struct tuple_format *format, const char *tuple,
>  }
>  
>  /**
> - * Get tuple field by its name.
> - * @param format Tuple format.
> - * @param tuple MessagePack tuple's body.
> - * @param field_map Tuple field map.
> - * @param name Field name.
> - * @param name_len Length of @a name.
> - * @param name_hash Hash of @a name.
> - *
> - * @retval not NULL MessagePack field.
> - * @retval     NULL No field with @a name.
> + * Retrieve msgpack data by JSON path.
> + * @param data Pointer to msgpack with data.
> + * @param path The path to process.
> + * @param path_len The length of the @path.
> + * @retval 0 On success.
> + * @retval >0 On path parsing error, invalid character position.
> + */
> +int
> +tuple_field_go_to_path(const char **data, const char *path, uint32_t path_len);

The error position returned by this function isn't used anywhere.
Let's make it return -1 on any error in the scope of this patch.

> +
> +/**
> + * Get a field at the specific position in this MessagePack
> + * array by fieldno and path.

All related functions use doxygen comments: see tuple_field_raw,
tuple_field_raw_by_full_path, tuple_field_by_part. Let's write
a proper comment for this one too for the sake of consistency.

>   */
>  static inline const char *
> -tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
> -			const uint32_t *field_map, const char *name,
> -			uint32_t name_len, uint32_t name_hash)
> +tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
> +			const uint32_t *field_map, uint32_t fieldno,
> +			const char *path, uint32_t path_len)
>  {
> -	uint32_t fieldno;
> -	if (tuple_fieldno_by_name(format->dict, name, name_len, name_hash,
> -				  &fieldno) != 0)
> +	tuple = tuple_field_raw(format, tuple, field_map, fieldno);
> +	if (path == NULL)
> +		return tuple;
> +	if (unlikely(tuple == NULL))
>  		return NULL;
> -	return tuple_field_raw(format, tuple, field_map, fieldno);
> +	if (unlikely(tuple_field_go_to_path(&tuple, path, path_len) != 0))
> +		return NULL;
> +	return tuple;
>  }
>  
>  /**

------------------------------------------------------------------------

To explain what I don't like about the refactoring done by this and
following patches, let me paste the resulting code here and comment
on it.

> /**
>  * Get a field at the specific position in this MessagePack array.
>  * Returns a pointer to MessagePack data.
>  * @param format tuple format
>  * @param tuple a pointer to MessagePack array
>  * @param field_map a pointer to the LAST element of field map
>  * @param field_no the index of field to return
>  *
>  * @returns field data if field exists or NULL
>  * @sa tuple_init_field_map()
>  */
> static inline const char *
> tuple_field_raw(struct tuple_format *format, const char *tuple,
> 		const uint32_t *field_map, uint32_t field_no)
> {
> 	if (likely(field_no < format->index_field_count)) {
> 		/* Indexed field */
> 
> 		if (field_no == 0) {
> 			mp_decode_array(&tuple);
> 			return tuple;
> 		}
> 
> 		int32_t offset_slot = tuple_format_field(format,
> 					field_no)->offset_slot;
> 		if (offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> 			if (field_map[offset_slot] != 0)
> 				return tuple + field_map[offset_slot];
> 			else
> 				return NULL;
> 		}

Note field_map manipulation.

> 	}
> 	ERROR_INJECT(ERRINJ_TUPLE_FIELD, return NULL);
> 	uint32_t field_count = mp_decode_array(&tuple);
> 	if (unlikely(field_no >= field_count))
> 		return NULL;
> 	for (uint32_t k = 0; k < field_no; k++)
> 		mp_next(&tuple);
> 	return tuple;
> }
> 
> /**
>  * Retrieve msgpack data by JSON path.
>  * @param data Pointer to msgpack with data.
>  * @param path The path to process.
>  * @param path_len The length of the @path.
>  * @retval 0 On success.
>  * @retval >0 On path parsing error, invalid character position.
>  */
> int
> tuple_field_go_to_path(const char **data, const char *path, uint32_t path_len);
> 
> /**
>  * Get a field at the specific position in this MessagePack
>  * array by fieldno and path.
>  * The offset_slot[out] may be specified to save it on use,
>  * set NULL otherwise.
>  */
> static inline const char *
> tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
> 			const uint32_t *field_map, uint32_t fieldno,
> 			const char *path, uint32_t path_len,
> 			int32_t *offset_slot)
> {
> 	if (offset_slot != NULL)
> 		*offset_slot = TUPLE_OFFSET_SLOT_NIL;
> 	if (likely(path != NULL && fieldno < format->index_field_count)) {
> 		/* Indexed field */
> 		struct tuple_field *field =
> 			tuple_format_field_by_path(format, fieldno,
> 						   path, path_len);
> 		if (field == NULL)
> 			goto parse;
> 		assert(field != NULL);
> 		if (offset_slot != NULL)
> 			*offset_slot = field->offset_slot;
> 		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> 			assert(-field->offset_slot * sizeof(uint32_t) <=
> 			       format->field_map_size);
> 			return field_map[field->offset_slot] == 0 ?
> 			       NULL : tuple + field_map[field->offset_slot];
> 		}

Note another field_map manipulation, copied from tuple_field_raw.

> 	}
> parse:
> 	tuple = tuple_field_raw(format, tuple, field_map, fieldno);

And here we'll get exactly the same piece of code inlined in this
function...

> 	if (path == NULL)
> 		return tuple;
> 	if (unlikely(tuple == NULL))
> 		return NULL;
> 	if (unlikely(tuple_field_go_to_path(&tuple, path, path_len) != 0))
> 		return NULL;
> 	return tuple;
> }
> 
> /**
>  * Get tuple field by its path.
>  * @param format Tuple format.
>  * @param tuple MessagePack tuple's body.
>  * @param field_map Tuple field map.
>  * @param path Field path.
>  * @param path_len Length of @a path.
>  * @param path_hash Hash of @a path.
>  *
>  * @retval field data if field exists or NULL
>  */
> const char *
> tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
> 			     const uint32_t *field_map, const char *path,
> 			     uint32_t path_len, uint32_t path_hash);
> 
> /**
>  * Get a tuple field pointed to by an index part.
>  * @param format Tuple format.
>  * @param tuple A pointer to MessagePack array.
>  * @param field_map A pointer to the LAST element of field map.
>  * @param part Index part to use.
>  * @retval Field data if the field exists or NULL.
>  */
> static inline const char *
> tuple_field_by_part_raw(struct tuple_format *format, const char *data,
> 			const uint32_t *field_map, struct key_part *part)
> {
> 	if (likely(part->format_epoch == format->epoch)) {
> 		int32_t offset_slot = part->offset_slot_cache;
> 		assert(-offset_slot * sizeof(uint32_t) <=
> 		       format->field_map_size);
> 		return field_map[offset_slot] != 0 ?
> 		       data + field_map[offset_slot] : NULL;

Note yet another field_map manipulation.

It is ugly that we have three places in the code where we use
field_map+offset_slot in exactly the same fashion. What's especially
ugly that this code will be multiplied by the compiler: two times for
tuple_field_raw_by_path and three times for tuple_field_by_part_raw.

I think that

 1. We should implement tuple_field_raw_by_path without tuple_field_raw.
 2. tuple_field_raw should be an alias for tuple_field_raw_by_path with
    path = NULL. That's OK - the compiler will remove unused code then
    so it shouldn't result in any performance penalty.
 3. tuple_field_by_part_raw shouldn't check field_map directly. Instead
    it should pass the cached offset_slot to tuple_field_raw_by_path to
    do the job. I mean offset_slot ptr should become an in-out parameter
    of tuple_field_raw_by_path: if it's not NULL/OFFSET_SLOT_NIL then it
    should use it, otherwise it should try to look up offset_slot in
    tuple_format; if found and offset_slot is not NULL it should update
    the offset_slot with it.

> 	} else {
> 		assert(format->epoch != 0);
> 		int32_t offset_slot;
> 		const char *field =
> 			tuple_field_raw_by_path(format, data, field_map,
> 						part->fieldno, part->path,
> 						part->path_len, &offset_slot);
> 		if (offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> 			part->format_epoch = format->epoch;
> 			part->offset_slot_cache = offset_slot;
> 		}
> 		return field;
> 	}
> }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 3/5] box: introduce JSON Indexes
  2019-01-16 13:44 ` [PATCH v8 3/5] box: introduce JSON Indexes Kirill Shcherbatov
@ 2019-01-23 13:49   ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2019-01-23 13:49 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches, kostja

On Wed, Jan 16, 2019 at 04:44:41PM +0300, Kirill Shcherbatov wrote:
> diff --git a/src/box/key_def.c b/src/box/key_def.c
> index dae3580e2..0ed497dfc 100644
> --- a/src/box/key_def.c
> +++ b/src/box/key_def.c
> @@ -485,6 +568,14 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
>  				 "index part: unknown sort order");
>  			return -1;
>  		}
> +		if (part->path != NULL &&
> +		    json_path_validate(part->path, strlen(part->path),
> +				       TUPLE_INDEX_BASE) != 0) {
> +			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
> +				 part->fieldno + TUPLE_INDEX_BASE,

The code above passes (i + TUPLE_INDEX_BASE) to diag_set. Please fix
this place to conform.

> +				 "invalid path");
> +			return -1;
> +		}
>  	}
>  	return 0;
>  }
> diff --git a/src/box/key_def.h b/src/box/key_def.h
> index d1866303b..fe4acffb5 100644
> --- a/src/box/key_def.h
> +++ b/src/box/key_def.h
> @@ -299,11 +320,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
>   *  [NUM, STR, ..][NUM, STR, ..]..,
>   *  OR
>   *  {field=NUM, type=STR, ..}{field=NUM, type=STR, ..}..,
> + *  The region is used for allocating JSON paths, if any.

Extra space before "The region is used ...". Makes the comment difficult
to read.

>   */
>  int
>  key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
>  		     const char **data, const struct field_def *fields,
> -		     uint32_t field_count);
> +		     uint32_t field_count, struct region *region);
>  
>  /**
>   * Returns the part in index_def->parts for the specified fieldno.
> diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
> index 61b4c9734..9af9c307a 100644
> --- a/src/box/tuple_format.c
> +++ b/src/box/tuple_format.c
> @@ -69,9 +69,86 @@ static const char *
>  tuple_field_path(const struct tuple_field *field)
>  {
>  	assert(field->token.parent != NULL);
> -	assert(field->token.parent->parent == NULL);
> -	assert(field->token.type == JSON_TOKEN_NUM);
> -	return int2str(field->token.num + TUPLE_INDEX_BASE);
> +	if (field->token.parent->parent == NULL) {
> +		/* Top-level field, no need to format the path. */
> +		return int2str(field->token.num + TUPLE_INDEX_BASE);
> +	}
> +	char *path = tt_static_buf();
> +	json_tree_snprint_path(path, TT_STATIC_BUF_LEN, &field->token,
> +			       TUPLE_INDEX_BASE);
> +	return path;
> +}
> +
> +/**
> + * Given a field number and a path, add the corresponding tuple
> + * field to the tuple format, allocating intermediate fields if
> + * necessary.
> + * Return a pointer to the leaf field on success, NULL on memory

They usually add an empty line as a paragraph delimiter - text looks
more readable that way.

> + * allocation error or type/nullability mistmatch error, diag
> + * message is set.
> + */
> +static struct tuple_field *
> +tuple_format_add_field(struct tuple_format *format, uint32_t fieldno,
> +		       const char *path, uint32_t path_len)
> +{
> +	struct tuple_field *field = NULL;
> +	struct tuple_field *parent = tuple_format_field(format, fieldno);

assert(parent != NULL);

would be useful here to show the reader that top-level fields have
already been created someplace else.

> +	if (path_len == 0)
> +		goto end;

Better 'return parent;' - we don't need to call tuple_field_delete() in
this case.

> +	field = tuple_field_new();
> +	if (field == NULL)
> +		goto fail;
> +
> +	int rc = 0;
> +	uint32_t token_count = 0;
> +	struct json_tree *tree = &format->fields;
> +	struct json_lexer lexer;
> +	json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
> +	while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 &&
> +	       field->token.type != JSON_TOKEN_END) {
> +		enum field_type expected_type =
> +			field->token.type == JSON_TOKEN_STR ?
> +			FIELD_TYPE_MAP : FIELD_TYPE_ARRAY;
> +		if (field_type1_contains_type2(parent->type, expected_type)) {
> +			parent->type = expected_type;
> +		} else {
> +			diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
> +				 tuple_field_path(parent),
> +				 field_type_strs[parent->type],
> +				 field_type_strs[expected_type]);
> +			goto fail;
> +		}
> +		struct tuple_field *next =
> +			json_tree_lookup_entry(tree, &parent->token,
> +					       &field->token,
> +					       struct tuple_field, token);
> +		if (next == NULL) {
> +			field->id = format->total_field_count++;
> +			rc = json_tree_add(tree, &parent->token, &field->token);
> +			if (rc != 0) {
> +				diag_set(OutOfMemory, sizeof(struct json_token),
> +					 "json_tree_add", "tree");
> +				goto fail;
> +			}
> +			next = field;
> +			field = tuple_field_new();
> +			if (field == NULL)
> +				goto fail;
> +		}
> +		parent = next;
> +		token_count++;
> +	}
> +	/* Path has been verified by key_def_decode_parts. */
> +	assert(rc == 0 && field->token.type == JSON_TOKEN_END);
> +	assert(parent != NULL);
> +	/* Update tree depth information. */
> +	format->fields_depth = MAX(format->fields_depth, token_count + 1);
> +end:
> +	tuple_field_delete(field);
> +	return parent;
> +fail:
> +	parent = NULL;
> +	goto end;
>  }
>  
>  /**
> @@ -95,15 +172,26 @@ tuple_format_field_by_id(struct tuple_format *format, uint32_t id)
>  static int
>  tuple_format_use_key_part(struct tuple_format *format, uint32_t field_count,
>  			  const struct key_part *part, bool is_sequential,
> -			  int *current_slot)
> +			  int *current_slot, char **path_pool)
>  {
>  	assert(part->fieldno < tuple_format_field_count(format));
> -	struct tuple_field *field = tuple_format_field(format, part->fieldno);
> +	const char *path = *path_pool;
> +	if (part->path != NULL) {
> +		/**
> +		 * Copy JSON path data to reserved area at the
> +		 * end of format allocation.
> +		 */
> +		memcpy(*path_pool, part->path, part->path_len);
> +		*path_pool += part->path_len;
> +	}

This piece should be moved inside tuple_format_add_field().

> +	struct tuple_field *field = tuple_format_add_field(format, part->fieldno,
> +							   path, part->path_len);
> +	if (field == NULL)
> +		return -1;
>  	/*
> -		* If a field is not present in the space format,
> -		* inherit nullable action of the first key part
> -		* referencing it.
> -		*/
> +	 * If a field is not present in the space format, inherit
> +	 * nullable action of the first key part referencing it.
> +	 */
>  	if (part->fieldno >= field_count && !field->is_key_part)
>  		field->nullable_action = part->nullable_action;
>  	/*
> @@ -512,49 +659,79 @@ tuple_init_field_map(struct tuple_format *format, uint32_t *field_map,
>  		/* Empty tuple, nothing to do. */
>  		goto skip;
>  	}
> -	/* first field is simply accessible, so we do not store offset to it */
> -	struct tuple_field *field = tuple_format_field(format, 0);
> -	if (validate &&
> -	    !field_mp_type_is_compatible(field->type, mp_typeof(*pos),
> -					 tuple_field_is_nullable(field))) {
> -		diag_set(ClientError, ER_FIELD_TYPE, tuple_field_path(field),
> -			 field_type_strs[field->type]);
> -		goto error;
> -	}
> -	if (required_fields != NULL)
> -		bit_clear(required_fields, field->id);
> -	mp_next(&pos);
> -	/* other fields...*/
> -	uint32_t i = 1;
>  	uint32_t defined_field_count = MIN(field_count, validate ?
>  					   tuple_format_field_count(format) :
>  					   format->index_field_count);
> -	if (field_count < format->index_field_count) {
> -		/*
> -		 * Nullify field map to be able to detect by 0,
> -		 * which key fields are absent in tuple_field().
> -		 */
> -		memset((char *)field_map - format->field_map_size, 0,
> -		       format->field_map_size);
> -	}
> -	for (; i < defined_field_count; ++i) {
> -		field = tuple_format_field(format, i);
> -		if (validate &&
> -		    !field_mp_type_is_compatible(field->type, mp_typeof(*pos),
> -						 tuple_field_is_nullable(field))) {
> -			diag_set(ClientError, ER_FIELD_TYPE,
> -				 tuple_field_path(field),
> -				 field_type_strs[field->type]);
> -			goto error;
> +	/*
> +	 * Nullify field map to be able to detect by 0,
> +	 * which key fields are absent in tuple_field().
> +	 */
> +	memset((char *)field_map - format->field_map_size, 0,
> +		format->field_map_size);
> +	uint32_t mp_frames_sz = format->fields_depth * sizeof(struct mp_frame);
> +	struct mp_frame *mp_frames = region_alloc(region, mp_frames_sz);
> +	if (mp_frames == NULL) {
> +		diag_set(OutOfMemory, mp_frames_sz, "region", "frames");
> +		goto error;
> +	}
> +	struct mp_stack mp_stack;
> +	mp_stack_init(&mp_stack, format->fields_depth, mp_frames);

Call them simply 'stack' and 'frames'.

> +	mp_stack_push(&mp_stack, MP_ARRAY, defined_field_count);
> +	struct tuple_field *field;
> +	struct json_token *parent = &format->fields.root;
> +	while (1) {
> +		while (mp_stack_advance(&mp_stack)) {
> +			mp_stack_pop(&mp_stack);
> +			if (mp_stack_is_empty(&mp_stack))
> +				goto skip;
> +			parent = parent->parent;
> +		}
> +		struct json_token token;
> +		token.type = mp_stack_top(&mp_stack)->type == MP_ARRAY ?
> +			     JSON_TOKEN_NUM : JSON_TOKEN_STR;
> +		if (token.type == JSON_TOKEN_NUM) {
> +			token.num = mp_stack_top(&mp_stack)->curr - 1;

I don't like that frame->curr index is in fact 1-based. Let's initialize
frame->curr with -1 (or UINT32_MAX) in mp_stack_push() so that the first
mp_stack_advance() called after mp_stack_push() would set it to 0.

Also, what about changing the API slightly?

	mp_stack_advance() -> item index (>= 0) on success
			      -1 if all items have been processed.

Then we wouldn't really need to access mp_frame to get 'curr' so we
could replace mp_stack_top() with mp_stack_type() => MP_MAP/MP_ARRAY.
Not sure if it's a really good idea though. The code would look more
concise, but need to check.

> +		} else if (token.type == JSON_TOKEN_STR) {
> +			if (mp_typeof(*pos) != MP_STR) {
> +				/* Skip key. */
> +				mp_next(&pos);
> +				mp_next(&pos);
> +				continue;
> +			}
> +			token.str = mp_decode_str(&pos, (uint32_t *)&token.len);
> +		} else {
> +			unreachable();
> +		}

The code would look better IMO if you rewrote it without double-if
(first check frame->type to initialize token.type, then check
token->type to initialize token.key):

		switch (mp_stack_top(&stack)->type) {
		case MP_MAP:
			token.type = JSON_TOKEN_STR;
			token.str = ...
			break;
		case MP_ARRAY:
			token.type = JSON_TOKEN_NUM;
			token.num = ...
		default:
			unreachable();
		}

> +		enum mp_type type = mp_typeof(*pos);
> +		assert(parent != NULL);
> +		field = json_tree_lookup_entry(&format->fields, parent, &token,
> +					       struct tuple_field, token);
> +		if (field != NULL) {
> +			bool is_nullable = tuple_field_is_nullable(field);
> +			if (validate &&
> +			    !field_mp_type_is_compatible(field->type, type,
> +							 is_nullable) != 0) {
> +				diag_set(ClientError, ER_FIELD_TYPE,
> +					 tuple_field_path(field),
> +					 field_type_strs[field->type]);
> +				goto error;
> +			}
> +			if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
> +				field_map[field->offset_slot] = pos - tuple;
> +			if (required_fields != NULL)
> +				bit_clear(required_fields, field->id);
>  		}
> -		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> -			field_map[field->offset_slot] =
> -				(uint32_t) (pos - tuple);
> +		if ((type == MP_ARRAY || type == MP_MAP) &&
> +		    !mp_stack_is_full(&mp_stack) && field != NULL) {
> +			uint32_t size = type == MP_ARRAY ?
> +					mp_decode_array(&pos) :
> +					mp_decode_map(&pos);
> +			mp_stack_push(&mp_stack, type, size);
> +			parent = &field->token;
> +		} else {
> +			mp_next(&pos);
>  		}
> -		if (required_fields != NULL)
> -			bit_clear(required_fields, field->id);
> -		mp_next(&pos);
> -	}
> +	};

The semicolon (;) isn't needed.

> diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
> index ddbc2d46f..23b7a5869 100644
> --- a/src/box/vy_point_lookup.c
> +++ b/src/box/vy_point_lookup.c
> @@ -196,7 +196,8 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx,
>  		const struct vy_read_view **rv,
>  		struct tuple *key, struct tuple **ret)
>  {
> -	assert(tuple_field_count(key) >= lsm->cmp_def->part_count);
> +	assert(tuple_field_count(key) >= lsm->cmp_def->part_count ||
> +	       lsm->cmp_def->has_json_paths);

This isn't how this assertion should be fixed - instead we should check
tuple_field_count for keys only (i.e. SELECT statements):

	/* All key parts must be set for a point lookup. */
	assert(vy_stmt_type(key) != IPROTO_SELECT ||
	       tuple_field_count(key) >= lsm->cmp_def->part_count);

>  
> diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua
> new file mode 100644
> index 000000000..a876ff1ed
> --- /dev/null
> +++ b/test/engine/json.test.lua
> @@ -0,0 +1,131 @@
> +test_run = require('test_run').new()
> +engine = test_run:get_cfg('engine')
> +--
> +-- gh-1012: Indexes for JSON-defined paths.
> +--
> +s = box.schema.space.create('withdata', {engine = engine})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO["fname"]'}, {3, 'str', path = '["FIO"].fname'}}})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '["FIO"]["fname"]'}}})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = 'FIO'}}})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[1]'}}})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO'}, {3, 'str', path = 'FIO.fname'}}})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[1].sname'}, {3, 'str', path = '["FIO"].fname'}}})
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO....fname'}}})
> +idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO.fname', is_nullable = false}, {3, 'str', path = '["FIO"]["sname"]'}}})
> +idx ~= nil
> +idx.parts[2].path == 'FIO.fname'
> +format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'array'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
> +s:format(format)
> +format = {{'id', 'unsigned'}, {'meta', 'unsigned'}, {'data', 'map'}, {'age', 'unsigned'}, {'level', 'unsigned'}}
> +s:format(format)
> +s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
> +s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
> +s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
> +s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
> +s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
> +s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
> +s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
> +s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
> +idx:select()
> +idx:min()
> +idx:max()
> +s:drop()
> +
> +s = box.schema.create_space('withdata', {engine = engine})
> +parts = {}
> +parts[1] = {1, 'unsigned', path='[2]'}
> +pk = s:create_index('pk', {parts = parts})
> +s:insert{{1, 2}, 3}
> +s:upsert({{box.null, 2}}, {{'+', 2, 5}})
> +s:get(2)
> +s:drop()
> +
> +-- Create index on space with data
> +s = box.schema.space.create('withdata', {engine = engine})
> +pk = s:create_index('primary', { type = 'tree' })
> +s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
> +s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
> +s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
> +s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
> +_ = s:delete(1)
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
> +_ = s:delete(2)
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}, {3, 'str', path = '["FIO"]["sname"]'}}})
> +_ = s:delete(4)
> +idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '["FIO"]["sname"]'}, {3, 'str', path = '["FIO"]["extra"]', is_nullable = true}}})
> +idx ~= nil
> +s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '["FIO"]["fname"]'}}})
> +idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '["FIO"]["fname"]'}}})
> +idx2 ~= nil
> +t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
> +-- Test field_map in tuple speed-up access by indexed path.
> +t["[3][\"FIO\"][\"fname\"]"]
> +idx:select()
> +idx:min()
> +idx:max()
> +idx:drop()
> +s:drop()
> +
> +-- Test complex JSON indexes
> +s = box.schema.space.create('withdata', {engine = engine})
> +parts = {}
> +parts[1] = {1, 'str', path='[3][2].a'}
> +parts[2] = {1, 'unsigned', path = '[3][1]'}
> +parts[3] = {2, 'str', path = '[2].d[1]'}
> +pk = s:create_index('primary', { type = 'tree', parts =  parts})
> +s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
> +s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
> +parts = {}
> +parts[1] = {4, 'unsigned', path='[1]', is_nullable = false}
> +parts[2] = {4, 'unsigned', path='[2]', is_nullable = true}
> +parts[3] = {4, 'unsigned', path='[4]', is_nullable = true}
> +trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
> +s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
> +parts = {}
> +parts[1] = {1, 'unsigned', path='[3][2].b' }
> +parts[2] = {3, 'unsigned'}
> +crosspart_idx = s:create_index('crosspart', { parts =  parts})
> +s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
> +parts = {}
> +parts[1] = {1, 'unsigned', path='[3][2].b'}
> +num_idx = s:create_index('numeric', {parts =  parts})
> +s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
> +num_idx:get(2)
> +num_idx:select()
> +num_idx:max()
> +num_idx:min()
> +crosspart_idx:max() == num_idx:max()
> +crosspart_idx:min() == num_idx:min()
> +trap_idx:max()
> +trap_idx:min()
> +s:drop()
> +
> +s = box.schema.space.create('withdata', {engine = engine})
> +pk_simplified = s:create_index('primary', { type = 'tree',  parts = {{1, 'unsigned'}}})
> +pk_simplified.path == box.NULL
> +idx = s:create_index('idx', {parts = {{2, 'integer', path = 'a'}}})
> +s:insert{31, {a = 1, aa = -1}}
> +s:insert{22, {a = 2, aa = -2}}
> +s:insert{13, {a = 3, aa = -3}}
> +idx:select()
> +idx:alter({parts = {{2, 'integer', path = 'aa'}}})
> +idx:select()
> +s:drop()
> +
> +-- incompatible format change
> +s = box.schema.space.create('test')
> +i = s:create_index('pk', {parts = {{1, 'integer', path = '[1]'}}})
> +s:insert{{-1}}
> +i:alter{parts = {{1, 'string', path = '[1]'}}}
> +s:insert{{'a'}}
> +i:drop()
> +i = s:create_index('pk', {parts = {{1, 'integer', path = '[1].FIO'}}})
> +s:insert{{{FIO=-1}}}
> +i:alter{parts = {{1, 'integer', path = '[1][1]'}}}
> +i:alter{parts = {{1, 'integer', path = '[1].FIO[1]'}}}
> +s:drop()
> +
> +engine = nil
> +test_run = nil

Needless assignments.

> +

Extra new line.

Please also check recovery, snapshotting, and select by multipart JSON
key (both full and partial). All both for secondary and primary indexes.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates
  2019-01-16 15:35 ` [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates Kirill Shcherbatov
@ 2019-01-23 14:15   ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2019-01-23 14:15 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches

On Wed, Jan 16, 2019 at 06:35:56PM +0300, Kirill Shcherbatov wrote:
> diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
> index 7ab6e3bf6..dff3aebbd 100644
> --- a/src/box/tuple_compare.cc
> +++ b/src/box/tuple_compare.cc
> @@ -458,11 +458,12 @@ tuple_common_key_parts(const struct tuple *tuple_a, const struct tuple *tuple_b,
>  	return i;
>  }
>  
> -template<bool is_nullable, bool has_optional_parts>
> +template<bool is_nullable, bool has_optional_parts, bool has_json_path>
>  static inline int
>  tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
>  		       struct key_def *key_def)
>  {
> +	assert(has_json_path == key_def->has_json_paths);

Sometimes you use 'has_json_paths', sometimes 'has_json_path'
(without 's'). Please be consistent.

>  	assert(!has_optional_parts || is_nullable);
>  	assert(is_nullable == key_def->is_nullable);
>  	assert(has_optional_parts == key_def->has_optional_parts);
> @@ -508,10 +509,19 @@ tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
>  		end = part + key_def->part_count;
>  
>  	for (; part < end; part++) {
> -		field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
> -						  field_map_a, part);
> -		field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
> -						  field_map_b, part);
> +		if (!has_json_path) {
> +			field_a = tuple_field_raw(format_a, tuple_a_raw,
> +						  field_map_a,
> +						  part->fieldno);

No need to carry part->fieldno to the next line - it fits.

> +			field_b = tuple_field_raw(format_b, tuple_b_raw,
> +						  field_map_b,
> +						  part->fieldno);

Ditto.

> +		} else {
> +			field_a = tuple_field_by_part_raw(format_a, tuple_a_raw,
> +							  field_map_a, part);
> +			field_b = tuple_field_by_part_raw(format_b, tuple_b_raw,
> +							  field_map_b, part);
> +		}
>  		assert(has_optional_parts ||
>  		       (field_a != NULL && field_b != NULL));
>  		if (! is_nullable) {
> diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
> index c10ee8ce4..78462908d 100644
> --- a/src/box/tuple_extract_key.cc
> +++ b/src/box/tuple_extract_key.cc
> @@ -133,7 +145,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
>  			 * minimize tuple_field_raw() calls.
>  			 */
>  			for (; i < part_count - 1; i++) {
> -				if (!key_def_parts_are_sequential(key_def, i)) {
> +				if (!key_def_parts_are_sequential
> +				        <has_json_paths>(key_def, i)) {

Bad alignment - looks like <has_json_paths>(key_def, i) isn't a part of
'if' condition. Please align by the parenthesis. Here and everywhere
else in the patch.

> diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
> index 3486ce11c..8ede29045 100644
> --- a/src/box/tuple_hash.cc
> +++ b/src/box/tuple_hash.cc
> @@ -347,9 +359,18 @@ tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
>  		 * need of tuple_field
>  		 */
>  		if (prev_fieldno + 1 != key_def->parts[part_id].fieldno) {
> -			struct key_part *part = &key_def->parts[part_id];
> -			field = tuple_field_by_part_raw(format, tuple_raw,
> -							field_map, part);
> +			if (!has_json_paths) {
> +				field = tuple_field(tuple,
> +						    key_def->parts[part_id].
> +						    fieldno);

Please don't carry .fieldno over to the next line - it looks confusing.
If you didn't remove local variable 'part', you would avoid that.

> +			} else {
> +				struct key_part *part =
> +					&key_def->parts[part_id];
> +				field = tuple_field_by_part_raw(format,
> +								tuple_raw,
> +								field_map,
> +								part);
> +			}
>  		}
>  		if (has_optional_parts && (field == NULL || field >= end)) {
>  			total_size += tuple_hash_null(&h, &carry);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 4/5] box: introduce offset_slot cache in key_part
  2019-01-16 13:44 ` [PATCH v8 4/5] box: introduce offset_slot cache in key_part Kirill Shcherbatov
@ 2019-01-23 14:23   ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2019-01-23 14:23 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches, kostja

On Wed, Jan 16, 2019 at 04:44:42PM +0300, Kirill Shcherbatov wrote:
> diff --git a/src/box/key_def.h b/src/box/key_def.h
> index fe4acffb5..7a71ed060 100644
> --- a/src/box/key_def.h
> +++ b/src/box/key_def.h
> @@ -97,6 +97,20 @@ struct key_part {
>  	char *path;
>  	/** The length of JSON path. */
>  	uint32_t path_len;
> +	/**
> +	 * Epoch of the tuple format the offset slot cached in
> +	 * this part is valid for, see tuple_format::epoch.
> +	 */
> +	uint64_t format_epoch;
> +	/**
> +	 * Cached value of the offset slot corresponding to
> +	 * the indexed field (tuple_field::offset_slot).
> +	 * Valid only if key_def::epoch equals the epoch of

key_def::format_epoch

> +	 * the tuple format. This value is updated in
> +	 * tuple_field_by_part_raw to always store the
> +	 * offset corresponding to the last used tuple format.
> +	 */
> +	int32_t offset_slot_cache;
>  };
>  
>  struct key_def;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 5/5] box: specify indexes in user-friendly form
  2019-01-16 13:44 ` [PATCH v8 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
@ 2019-01-23 15:29   ` Vladimir Davydov
  0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Davydov @ 2019-01-23 15:29 UTC (permalink / raw)
  To: Kirill Shcherbatov; +Cc: tarantool-patches, kostja

On Wed, Jan 16, 2019 at 04:44:43PM +0300, Kirill Shcherbatov wrote:
> Implemented a more convenient interface for creating an index
> by JSON path. Instead of specifying fieldno and relative path
> it comes possible to pass full JSON path to data.

it is now possible

> 
> Closes #1012
> 
> @TarantoolBot document
> Title: Indexes by JSON path
> Sometimes field data could have complex document structure.
> When this structure is consistent across whole space,
> you are able to create an index by JSON path.
> 
> Example:
> s = box.schema.space.create('sample')
> format = {{'id', 'unsigned'}, {'data', 'map'}}
> s:format(format)
> -- explicit JSON index creation
> age_idx = s:create_index('age', {{2, 'number', path = "age"}})
> -- user-friendly syntax for JSON index creation
> parts = {{'data.FIO["fname"]', 'str'}, {'data.FIO["sname"]', 'str'},
>          {'data.age', 'number'}}
> info_idx = s:create_index('info', {parts = parts}})
> s:insert({1, {FIO={fname="James", sname="Bond"}, age=35}})
> ---
>  src/box/lua/schema.lua    | 60 ++++++++++++++++++++++++++++++++++-----
>  test/engine/json.result   | 42 +++++++++++++++++++++++++++
>  test/engine/json.test.lua | 12 ++++++++
>  3 files changed, 107 insertions(+), 7 deletions(-)
> 
> diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
> index 8a804f0ba..a55ef5b22 100644
> --- a/src/box/lua/schema.lua
> +++ b/src/box/lua/schema.lua
> @@ -575,6 +575,49 @@ local function update_index_parts_1_6_0(parts)
>      return result
>  end
>  

Please add comments to these functions.

> +local function format_field_index_by_name(format, name)
> +    for k,v in pairs(format) do
> +        if v.name == name then
> +            return k
> +        end
> +    end
> +    return nil
> +end
> +
> +local function format_field_resolve(format, name)
> +    local idx = nil
> +    local field_name = nil
> +    -- try resolve whole name

try to

> +    idx = format_field_index_by_name(format, name)
> +    if idx ~= nil then
> +        print("Case 1 "..idx)

What's this? Debugging prints? Looks like you didn't care to carefully
review your patches before sending them out... again.

> +        return idx, nil
> +    end
> +    -- try resolve [%d]
> +    field_name = string.match(name, "^%[(%d+)%]")
> +    idx = tonumber(field_name)
> +    if idx ~= nil then
> +        print("Case 2 "..idx..string.sub(name, string.len(field_name) + 3))
> +        return idx, string.sub(name, string.len(field_name) + 3)
> +    end
> +    -- try resolve ["%s"] or ['%s']
> +    field_name = string.match(name, "^%[\"(.*)\"%]") or
> +                 string.match(name, "^%[\'(.*)\'%]")

Here you use asterisk (*) while in [%d] case you use plus (+).
I understand there's actually no difference in this particular
case, but please be consistent.

> +    idx = format_field_index_by_name(format, field_name)
> +    if idx ~= nil then
> +        print("Case 3 "..idx..string.sub(name, string.len(field_name) + 5))
> +        return idx, string.sub(name, string.len(field_name) + 5)
> +    end
> +    -- try to resolve .*[ and .*.
> +    field_name = string.match(name, "^([^.%[]-)[.%[]")

And here you use minus (-) for some reason.
Wouldn't simple "^([^.[]+)" work?

> +    idx = format_field_index_by_name(format, field_name)
> +    if idx ~= nil then
> +        print("Case 4 "..idx..string.sub(name, string.len(field_name) + 1))
> +        return idx, string.sub(name, string.len(field_name) + 1)
> +    end
> +    return nil, nil
> +end
> +
>  local function update_index_parts(format, parts)
>      if type(parts) ~= "table" then
>          box.error(box.error.ILLEGAL_PARAMS,
> @@ -626,16 +669,19 @@ local function update_index_parts(format, parts)
>              box.error(box.error.ILLEGAL_PARAMS,
>                        "options.parts[" .. i .. "]: field (name or number) is expected")
>          elseif type(part.field) == 'string' then
> -            for k,v in pairs(format) do
> -                if v.name == part.field then
> -                    part.field = k
> -                    break
> -                end
> +            local idx, path = format_field_resolve(format, part.field)
> +            if idx == nil then
> +               box.error(box.error.ILLEGAL_PARAMS,

Bad formatting - should be 4 spaces after if, not 3.

> +                         "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")

The line is too long. Please split.

>              end
> -            if type(part.field) == 'string' then
> +            if part.path ~= nil and part.path ~= path then
>                  box.error(box.error.ILLEGAL_PARAMS,
> -                          "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")
> +                          "options.parts[" .. i .. "]: field path '"..
> +                          path.." doesn't math the path '" ..part.path.."'")

Sometimes you add spaces around concatenation operator (..), sometimes
you don't. I think I've already repeated it a dozen times, but let me
try again: please be consistent throughout the code.

Anyway, I don't think that there's any point to ensure that part.path
matches the path extracted from the field. The error message looks weird
too. Let's just ignore part.path in this case.

>              end
> +            parts_can_be_simplified = parts_can_be_simplified and path == nil
> +            part.field = idx
> +            part.path = path or part.path
>          elseif part.field == 0 then
>              box.error(box.error.ILLEGAL_PARAMS,
>                        "options.parts[" .. i .. "]: field (number) must be one-based")

Let's please unite all the code doing field resolution in
tuple_field_resolve, including the case when the field is
given as a number. And check that the field is 1-based in
all cases.

> diff --git a/test/engine/json.result b/test/engine/json.result
> index 9e5accb0b..32f787a3d 100644
> --- a/test/engine/json.result
> +++ b/test/engine/json.result
> @@ -70,6 +70,48 @@ s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname
>  - error: Field [3]["FIO"]["fname"] has type 'string' in one index, but type 'number'
>      in another
>  ...
> +s:create_index('test3', {parts = {{2, 'number'}, {']sad.FIO["fname"]', 'str'}}})
> +---
> +- error: 'Illegal parameters, options.parts[2]: field was not found by name '']sad.FIO["fname"]'''
> +...
> +s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str', path = "FIO[\"sname\"]"}}})
> +---
> +- error: 'Illegal parameters, options.parts[2]: field path ''.FIO["fname"] doesn''t
> +    math the path ''FIO["sname"]'''
> +...
> +idx3 = s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str'}}})
> +---
> +...
> +idx3 ~= nil
> +---
> +- true
> +...
> +idx3.parts[2].path == ".FIO[\"fname\"]"
> +---
> +- true
> +...
> +s:create_index('test4', {parts = {{2, 'number'}, {'invalid.FIO["fname"]', 'str'}}})
> +---
> +- error: 'Illegal parameters, options.parts[2]: field was not found by name ''invalid.FIO["fname"]'''
> +...
> +idx4 = s:create_index('test4', {parts = {{2, 'number'}, {'data.FIO["fname"]', 'str'}}})
> +---
> +...
> +idx4 ~= nil
> +---
> +- true
> +...
> +idx4.parts[2].path == ".FIO[\"fname\"]"
> +---
> +- true
> +...
> +-- Vinyl has optimizations that omit index checks, so errors could differ.

What's this about? I don't understand...

> +idx3:drop()
> +---
> +...
> +idx4:drop()
> +---
> +...
>  s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
>  ---
>  - error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected

I'm not planning to review the next version of this patch set for a week
or two so please take your time and perfect it finally unless you want
it to be delayed for another month.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-01-23 15:29 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
2019-01-16 13:44 ` [PATCH v8 1/5] lib: updated msgpuck library version Kirill Shcherbatov
2019-01-23 12:53   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine Kirill Shcherbatov
2019-01-23 13:15   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 3/5] box: introduce JSON Indexes Kirill Shcherbatov
2019-01-23 13:49   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 4/5] box: introduce offset_slot cache in key_part Kirill Shcherbatov
2019-01-23 14:23   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
2019-01-23 15:29   ` Vladimir Davydov
2019-01-16 15:35 ` [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates Kirill Shcherbatov
2019-01-23 14:15   ` Vladimir Davydov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox