From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [tarantool-patches] Re: [PATCH v1 4/5] box: introduce path_hash and tuple_field tree
From: Vladislav Shpilevoy
Message-ID: <37cd15a8-8060-8e2b-0de1-b478df10c116@tarantool.org>
Date: Thu, 9 Aug 2018 01:03:11 +0300
In-Reply-To: <34eae62c9453063f7ff90ce6fe6a7feba0440ca9.1533558332.git.kshcherbatov@tarantool.org>
References: <34eae62c9453063f7ff90ce6fe6a7feba0440ca9.1533558332.git.kshcherbatov@tarantool.org>
To: tarantool-patches@freelists.org, Kirill Shcherbatov
List-Id: tarantool-patches

Thanks for the patch! See 18 comments below.

On 06/08/2018 15:27, Kirill Shcherbatov wrote:
> To work with JSON-defined indexes we introduce format JSON
> path hashtable data_path and a tree of intermediate path
> fields attached to format root fields.
> 
> : format->data_path
> [2].FIO.fname -> field "fname" {type=str, off_slot=-1}
> [2].FIO.sname -> field "sname" {type=str, off_slot=-2}
> 
> : format->field[2] {type = map}
>              |
>             FIO {type = map}
>              |
>     "fname"  |  "sname"
> {type=str,off_slot=-1} ____|____ {type=str,off_slot=-2}
> 
> Leaf fields used in Index have initialized offset_slot.
> On new tuple creation we observe fields tree and use leaf
> records to init tuple field_map.
> At the same time we use data_path hashtable on tuple data
> access by index (when cached offset_slot is invalid).
> 
> New routine tuple_format_add_json_path is used to construct
> all internal structures for JSON path on new format creation
> and duplicating.
> 
> Part of #1012.
> ---
>  src/box/errcode.h            |   2 +-
>  src/box/index_def.c          |   8 +-
>  src/box/key_def.c            |   4 +-
>  src/box/tuple_extract_key.cc |  23 +-
>  src/box/tuple_format.c       | 531 +++++++++++++++++++++++++++++++++++++++++--
>  src/box/tuple_format.h       |  20 +-
>  test/box/misc.result         |  51 +++--
>  test/engine/tuple.result     | 101 ++++++++
>  test/engine/tuple.test.lua   |  27 +++
>  9 files changed, 704 insertions(+), 63 deletions(-)
> 
> diff --git a/src/box/key_def.c b/src/box/key_def.c
> index 79e07f8..c454377 100644
> --- a/src/box/key_def.c
> +++ b/src/box/key_def.c
> @@ -583,8 +583,8 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
> 		if (node.type != JSON_PATH_NUM) {
> 			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
> 				 part->fieldno,
> -				 "invalid JSON path: first part should "
> -				 "be defined as array");
> +				 "invalid JSON path: first path part "\
> +				 "should be defined as array");

1. A commit of a patchset should not fix errors introduced earlier in the
same patchset. Please merge this hunk into the previous commit.
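As an illustration of the "path -> leaf" lookup the commit message
describes, a toy model might look like the sketch below. All names here
(toy_leaf, toy_path_lookup) are invented for illustration, and a linear
scan stands in for the real mhash table:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model of the data_path lookup: leaf fields carry an offset
 * slot, and a flat "path -> leaf" table is consulted on data access
 * when a part's cached offset_slot is stale.
 */
struct toy_leaf {
	const char *path;
	int off_slot;
};

/* Linear scan in place of the real hash lookup. */
static const struct toy_leaf *
toy_path_lookup(const struct toy_leaf *leaves, size_t count,
		const char *path)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(leaves[i].path, path) == 0)
			return &leaves[i];
	}
	return NULL;
}
```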
> return -1; > } > if (node.num - TUPLE_INDEX_BASE != part->fieldno) { > diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc > index bbd87f5..6a2690b 100644 > --- a/src/box/tuple_extract_key.cc > +++ b/src/box/tuple_extract_key.cc > @@ -4,14 +4,22 @@ > > enum { MSGPACK_NULL = 0xc0 }; > > +static bool > +key_def_parts_sequential(const struct key_def *def, int i) 2. For flags and flag calculators please use 'is/are'. > +{ > + uint32_t fieldno1 = def->parts[i + 1].fieldno; > + uint32_t fieldno2 = def->parts[i].fieldno + 1; > + return fieldno1 == fieldno2 && def->parts[i].path != NULL && > + def->parts[i + 1].path != NULL; > +} > + > /** True, if a key con contain two or more parts in sequence. */ > static bool > key_def_contains_sequential_parts(const struct key_def *def) > { > - for (uint32_t i = 0; i < def->part_count - 1; ++i) { > - if (def->parts[i].fieldno + 1 == def->parts[i + 1].fieldno) > + for (uint32_t i = 0; i < def->part_count - 1; ++i) > + if (key_def_parts_sequential(def, i)) > return true; > - } 3. The 'for' consists of two lines, so should have {}. > return false; > } > 4. I found a problem with your keys extractor and tried to reproduce it with a simple test but got another crash even earlier than I expected: Assertion failed: (is_sequential == false), function tuple_format_create, file /Users/v.shpilevoy/Work/Repositories/tarantool/src/box/tuple_format.c, line 509. Process 93586 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT frame #0: 0x00007fff56cf5b66 libsystem_kernel.dylib`__pthread_kill + 10 libsystem_kernel.dylib`__pthread_kill: -> 0x7fff56cf5b66 <+10>: jae 0x7fff56cf5b70 ; <+20> 0x7fff56cf5b68 <+12>: movq %rax, %rdi 0x7fff56cf5b6b <+15>: jmp 0x7fff56cecae9 ; cerror_nocancel 0x7fff56cf5b70 <+20>: retq Target 0: (tarantool) stopped. 
The test is:

    box.cfg{}
    s = box.schema.create_space('test')
    parts = {}
    parts[1] = {'[1][1]', 'unsigned'}
    pk = s:create_index('pk', {parts = parts})

But this is not what I wanted to test. The initial problem I found is in
how you extract keys: in the extract key slowpath functions (both raw and
not) you iterate over the very first level of a tuple and treat its whole
top-level fields as key parts, which is wrong now. In the tuple key
extractor you should dig into a field when the part has a JSON path. And
this is where you again need the template parameter 'is_flat' to omit
such a check for flat indexes.

The original test I wanted to run was this:

    box.cfg{}
    s = box.schema.create_space('test')
    parts = {}
    parts[1] = {'[1][1]', 'unsigned'}
    pk = s:create_index('pk', {parts = parts})
    s:insert{{1, 1}}
    s:upsert({{1}}, {{'+', 2, 1}})

Here, I think, upsert will try to extract the key 1, but will actually
extract the entire {1} and crash somewhere.

> diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
> index 40cd48a..f03978d 100644
> --- a/src/box/tuple_format.c
> +++ b/src/box/tuple_format.c
> @@ -32,6 +32,34 @@
>  #include "tuple.h"
>  #include "tuple_format.h"
>  
> +path_hash_f path_hash;
> +
> +#define mh_name _field
> +struct mh_field_key_t {
> +	const char *path;
> +	uint32_t path_len;
> +	uint32_t path_hash;
> +};
> +#define mh_key_t struct mh_field_key_t *
> +
> +struct mh_field_node_t {
> +	const char *path;
> +	uint32_t path_len;
> +	uint32_t path_hash;
> +	struct tuple_field *field;
> +};
> +#define mh_node_t struct mh_field_node_t
> +
> +#define mh_arg_t void *
> +#define mh_hash(a, arg) ((a)->path_hash)
> +#define mh_hash_key(a, arg) mh_hash(a, arg)
> +#define mh_cmp(a, b, arg) ((a)->path_len != (b)->path_len || \
> +	memcmp((a)->path, (b)->path, (a)->path_len))
> +#define mh_cmp_key(a, b, arg) mh_cmp(a, b, arg)
> +#define MH_SOURCE 1
> +#include "salad/mhash.h" /* Create mh_field_t hash. */
> +

5.
Exactly the same hash already exists in assoc.h as mh_strnptr, as I
already told you in person.

> +
>  /** Global table of tuple formats */
>  struct tuple_format **tuple_formats;
>  static intptr_t recycled_format_ids = FORMAT_ID_NIL;
> @@ -41,10 +69,347 @@ static uint64_t format_epoch = 1;
>  static uint32_t formats_size = 0, formats_capacity = 0;
>  
>  static const struct tuple_field tuple_field_default = {
> -	FIELD_TYPE_ANY, TUPLE_OFFSET_SLOT_NIL, false, false,
> +	FIELD_TYPE_ANY, TUPLE_OFFSET_SLOT_NIL, false, false, {{NULL, 0}, NULL}
>  };
>  
>  /**
> + * Propagate @a field to MessagePack(field)[index].
> + * @param[in][out] field Field to propagate.
> + * @param index 1-based index to propagate to.
> + *
> + * @retval 0 Success, the index was found.
> + * @retval -1 Not found.
> + */
> +static inline int
> +tuple_field_go_to_index(const char **field, uint64_t index);
> +
> +/**
> + * Propagate @a field to MessagePack(field)[key].
> + * @param[in][out] field Field to propagate.
> + * @param key Key to propagate to.
> + * @param len Length of @a key.
> + *
> + * @retval 0 Success, the index was found.
> + * @retval -1 Not found.
> + */
> +static inline int
> +tuple_field_go_to_key(const char **field, const char *key, int len);

6. You do not need to announce these functions. Just move their
implementations here.

> +
> +/**
> + * Get @hashtable record by key @path, @path_len.
> + * @param hashtable to lookup,
> + * @param path string.
> + * @param path_len length of @path.
> + * @retval NULL on nothing found.
> + * @retval hashtable record pointer.
> + */
> +static struct mh_field_node_t *
> +json_path_hash_get(struct mh_field_t *hashtable, const char *path,
> +		   uint32_t path_len)
> +{
> +	assert(hashtable != NULL);
> +	uint32_t hash = field_name_hash(path, path_len);
> +	struct mh_field_key_t key = {path, path_len, hash};
> +	mh_int_t rc = mh_field_find(hashtable, &key, NULL);
> +	if (rc == mh_end(hashtable))
> +		return NULL;
> +	return mh_field_node(hashtable, rc);
> +}

7.
Use mh_strnptr_find_inp. > + > +/** > + * Construct field tree level for JSON path part. > + * > + * @param[in, out] tuple_field pointer to record to start with > + * would be changed to record that math > + * @part lexeme. > + * @param fieldno number of root space field. > + * @param part JSON path lexeme to represent in field tree. > + * @retval -1 on error. > + * @retval 0 on success. > + */ > +static int > +json_field_tree_append(struct tuple_field **field_subtree, uint32_t fieldno, > + struct json_path_node *part) > +{ > + enum field_type type; > + struct tuple_field *field = *field_subtree; > + switch (part->type) { > + case JSON_PATH_NUM: { > + type = FIELD_TYPE_ARRAY; > + if (field->type != FIELD_TYPE_ANY && field->type != type) > + goto error_type_mistmatch; > + field->type = type; > + /* Create or resize field array if required. */ > + if (field->array == NULL) { > + field->array = > + calloc(part->num, sizeof(struct tuple_field *)); > + if (field->array == NULL) { > + diag_set(OutOfMemory, > + part->num * sizeof(struct tuple_field *), > + "malloc", "field->array"); > + return -1; > + } > + field->type = type; > + field->array_size = part->num; > + } else if (part->num >= field->array_size) { > + void *array = > + realloc(field->array, > + part->num * sizeof(struct tuple_field *)); 8. Realloc works ok with NULL in the first argument, so you can merge this if-else branching. If part->num > size then realloc + memset of the gap. No special calloc. > + if (array == NULL) { > + diag_set(OutOfMemory, > + sizeof(struct tuple_field *), "realloc", > + "array"); > + return -1; > + } > + memset(field->array[field->array_size], 0, > + part->num - field->array_size); > + field->array = array; > + field->array_size = part->num; > + } > + /* Record already exists. 
No actions required */ > + if (field->array[part->num - TUPLE_INDEX_BASE] != NULL) { > + *field_subtree = > + field->array[part->num - TUPLE_INDEX_BASE]; > + return 0; > + } > + break; > + } > + case JSON_PATH_STR: { > + type = FIELD_TYPE_MAP; > + if (field->type != FIELD_TYPE_ANY && field->type != type) > + goto error_type_mistmatch; > + field->type = type; > + if (field->map == NULL && > + json_path_hash_create(&field->map) != 0) > + return -1; > + struct mh_field_node_t *node = > + json_path_hash_get(field->map, part->str, part->len); > + if (node != NULL) { > + *field_subtree = node->field; > + return 0; > + } > + break; > + } > + default: > + unreachable(); > + } > + > + /* Construct and insert a new record. */ > + struct tuple_field *new_field = > + malloc(sizeof(struct tuple_field)); > + if (new_field == NULL) { > + diag_set(OutOfMemory, sizeof(struct tuple_field), "malloc", > + "new_field"); > + return -1; > + } > + *new_field = tuple_field_default; > + if (field->type == FIELD_TYPE_MAP) { > + if (json_path_hash_insert(field->map, part->str, part->len, > + new_field) != 0) { > + free(new_field); > + return -1; > + } > + } else if (field->type == FIELD_TYPE_ARRAY) { > + field->array[part->num - TUPLE_INDEX_BASE] = new_field; > + } > + *field_subtree = new_field; > + > + return 0; > + > +error_type_mistmatch: > + diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH, > + tt_sprintf("%d", fieldno + TUPLE_INDEX_BASE), > + field_type_strs[type], field_type_strs[field->type]); > + return -1; > +} > + > +/** > + * Add new JSON @path to @format. > + * @param format to modify. > + * @param path string to add. > + * @param path_len length of @path. > + * @param field_type type of field by @path. > + * @param[out] leaf_field pointer to leaf field. 
> + */
> +static int
> +tuple_format_add_json_path(struct tuple_format *format, const char *path,
> +			   uint32_t path_len, enum field_type type,
> +			   struct tuple_field **leaf_field)
> +{
> +
> +	struct mh_field_node_t *leaf_node = NULL;
> +	if (format->path_hash == NULL) {
> +		/* Initialize path hashtable. */
> +		if (json_path_hash_create(&format->path_hash) != 0)

9. Please, preallocate the needed number of hash elements once, using
the mh reserve() function.

> +			return -1;
> +	}
> +
> +	/*
> +	 * Get root field by index.
> +	 * Path is specified in canonical form: [i]...
> +	 */
> +	int rc = 0;
> +	struct json_path_parser parser;
> +	struct json_path_node node;
> +	json_path_parser_create(&parser, path, path_len);
> +	rc = json_path_next(&parser, &node);
> +	assert(rc == 0 && node.type == JSON_PATH_NUM);
> +	assert(node.num < format->field_count + 1);
> +	uint32_t root_fieldno = node.num - TUPLE_INDEX_BASE;
> +	struct tuple_field *field = &format->fields[root_fieldno];
> +
> +	/* Test if path is already registered. */
> +	if ((leaf_node = json_path_hash_get(format->path_hash, path,
> +					    path_len)) != NULL) {
> +		if (leaf_node->field->type != type) {
> +			const char *err =
> +				tt_sprintf("JSON path '%.*s' has been already "
> +					   "constructed for '%s' leaf record",
> +					   path_len, path,
> +					   field_type_strs[
> +						leaf_node->field->type]);
> +			diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
> +				 node.num, err);
> +			return -1;
> +		}
> +		*leaf_field = leaf_node->field;
> +		/* Path already registered. */
> +		return 0;
> +	}
> +
> +	/*
> +	 * Hash table would hold string that path tree
> +	 * would refer to.
> +	 */
> +	if ((path = strdup(path)) == NULL) {

10. Please, allocate the JSON paths memory in the same block as the
format itself, see tuple_format_alloc. This lets you avoid the strdup
here and the 'free_path' flag in json_path_hash_delete, and it reduces
memory fragmentation.
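To sketch that single-allocation layout: reserve room for the path
strings right after the struct, so no separate strdup() is needed and
freeing the format frees the paths too. The names below (toy_format,
toy_format_alloc) are invented for this example; the real
tuple_format_alloc() differs:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_format {
	size_t path_count;
	char **paths;	/* points into the same allocation */
};

/* One malloc holds the struct, the pointer array and the strings. */
static struct toy_format *
toy_format_alloc(const char **paths, size_t count)
{
	size_t strings_size = 0;
	for (size_t i = 0; i < count; i++)
		strings_size += strlen(paths[i]) + 1;
	size_t total = sizeof(struct toy_format) +
		       count * sizeof(char *) + strings_size;
	struct toy_format *fmt = malloc(total);
	if (fmt == NULL)
		return NULL;
	fmt->path_count = count;
	fmt->paths = (char **)(fmt + 1);
	char *pos = (char *)(fmt->paths + count);
	for (size_t i = 0; i < count; i++) {
		size_t len = strlen(paths[i]) + 1;
		memcpy(pos, paths[i], len);
		fmt->paths[i] = pos;
		pos += len;
	}
	return fmt;
}
```

A single free(fmt) then releases everything at once.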
> + diag_set(OutOfMemory, path_len + 1, "strdup", "path"); > + return -1; > + } > + > + /* Initialize dummy path_hash record. */ > + if (json_path_hash_insert(format->path_hash, path, path_len, > + NULL) != 0) { > + free((void *)path); > + return -1; > + } > + > + /* Build data path tree. */ > + while ((rc = json_path_next(&parser, &node)) == 0 && > + node.type != JSON_PATH_END) { > + if (json_field_tree_append(&field, root_fieldno, &node) != 0) > + return -1; > + } > + /* It should be canonical valid JSON. */ > + assert(rc == 0 && node.type == JSON_PATH_END); > + /* Leaf record is a new object as JSON path unique. */ > + field->type = type; > + > + /* Finish hashtable record init. */ > + leaf_node = json_path_hash_get(format->path_hash, path, path_len); > + assert(leaf_node != NULL && leaf_node->field == NULL); > + leaf_node->field = field; > + > + *leaf_field = field; > + return 0; > +} > + > +/** > * Extract all available type info from keys and field > * definitions. > */ > @@ -354,6 +770,59 @@ tuple_format_dup(struct tuple_format *src) > return NULL; > } > return format; > + > +error: > + tuple_format_destroy(format); > + return NULL; > +} > + > +static int > +tuple_init_json_field_map(const struct tuple_field *field, uint32_t idx, > + uint32_t *field_map, const char *tuple, > + const char *offset) > +{ > + if (field->type == FIELD_TYPE_MAP) { > + mh_int_t i; > + mh_foreach(field->map, i) { > + struct mh_field_node_t *node = > + mh_field_node(field->map, i); > + const char *raw = offset; > + if (tuple_field_go_to_key(&raw, node->path, > + node->path_len) != 0) { > + diag_set(ClientError, > + ER_DATA_MISMATCH_INDEX_PART); > + return -1; > + } > + if (tuple_init_json_field_map(node->field, idx, > + field_map, tuple, > + raw) != 0) > + return -1; > + } > + } else if (field->type == FIELD_TYPE_ARRAY) { > + for (uint32_t i = 0; i < field->array_size; i++) { > + struct tuple_field *field = field->array[i]; > + if (field == NULL) > + continue; > + const char *raw = 
offset;
> +			if (tuple_field_go_to_index(&raw, i) != 0) {
> +				diag_set(ClientError,
> +					 ER_DATA_MISMATCH_INDEX_PART);
> +				return -1;
> +			}

11. When the field is an array, you can iterate over its entries and
check whether the corresponding field->array[i] is not NULL. This allows
iterating over the array only once, whereas now you rescan it O(N)
times, where N is the size of field->array.

12. For the map above you can do another thing. Each time you call
go_to_key() you can remember the key index (the number of iterations
inside go_to_key()) and the result pointer. During the cycle you track
the maximal key index and its pointer, and then you finish decoding the
field starting from that point. This is a strong optimization. For
example, now for a field {a = {b = {c = d, e = f}, g = h}} and a key at
a.b.c you decode it as

    {a = {b = {c = d, e = f}, g = h}}
    {b = {c = d, e = f}}
    {c = d, e = f}

This is because after you have found the key you still need to skip the
entire map field to find the next one (see the next comment). With my
method you decode it as

    {a = {b = {c = d, e = f}, g = h}}

> +			if (tuple_init_json_field_map(field, idx, field_map,
> +						      tuple, raw) != 0)
> +				return -1;
> +		}
> +	} else {
> +		/* Leaf field node. */
> +		assert(field->offset_slot != TUPLE_OFFSET_SLOT_NIL);
> +		if (key_mp_type_validate(field->type, mp_typeof(*offset),
> +					 ER_KEY_PART_TYPE, idx,
> +					 field->is_nullable) != 0)
> +			return -1;
> +		field_map[field->offset_slot] = (uint32_t)(offset - tuple);
> +	}
> +	return 0;
> }
> 
> /** @sa declaration for details. */
> @@ -411,6 +880,12 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
> 			field_map[field->offset_slot] =
> 				(uint32_t) (pos - tuple);
> 		}
> +		if (field->map != NULL) {
> +			assert(field->array != NULL);
> +			if (tuple_init_json_field_map(field, i, field_map,
> +						      tuple, pos) != 0)

13. Now you rescan the field pointed to by pos twice: first inside
tuple_init_json_field_map, then in the mp_next below. You should not.
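A toy illustration of the single pass suggested in comment 11: walk the
array once, advancing a cursor, and record an offset whenever the child
is indexed, instead of rescanning from the start per child. All names
are invented; sizes[i] stands in for the encoded size of field i and the
+= step stands in for mp_next():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill field_map[i] with the offset of field i for indexed fields
 * only, in a single scan over the array. Returns the number of
 * slots filled.
 */
static size_t
toy_fill_field_map(const size_t *sizes, size_t count,
		   const int *is_indexed, size_t *field_map)
{
	size_t offset = 0, filled = 0;
	for (size_t i = 0; i < count; i++) {
		if (is_indexed[i]) {
			field_map[i] = offset;
			filled++;
		}
		offset += sizes[i]; /* the mp_next() step */
	}
	return filled;
}
```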
> +				return -1;
> +		}
> 		mp_next(&pos);
> 	}
> 	return 0;
> @@ -565,17 +1023,33 @@ tuple_field_by_part(const struct tuple *tuple, const struct key_def *def,
> 	assert(format->epoch > 0);
> 	int32_t offset_slot = TUPLE_OFFSET_SLOT_NIL;
> 	if (part->format_epoch != format->epoch) {
> -		offset_slot = format->fields[field_no].offset_slot;
> +		/* Cache miss. */
> +		if (part->path != NULL) {
> +			struct mh_field_t *ht = format->path_hash;
> +			struct mh_field_node_t *node =
> +				json_path_hash_get(ht, part->path,
> +						   part->path_len);

14. As I already told you in person, instead of recalculating the hash
from the path you can store it in the key_part together with the slot
cache and epoch, and use a light version of json_path_hash_get that
takes a path with an already calculated hash.

> +			assert(node != NULL);
> +			offset_slot = node->field->offset_slot;
> +		} else {
> +			offset_slot =
> +				format->fields[field_no].offset_slot;
> +		}
> +		/* Update cache only for greater epoch. */
> 		if (format->epoch > part->format_epoch) {
> 			part->slot_cache = offset_slot;
> 			part->format_epoch = format->epoch;
> 		}
> +	} else {
> +		/* Cache hit. */
> +		offset_slot = part->slot_cache;
> 	}
> 	if (offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> 		return field_map[offset_slot] != 0 ?
> 		       data + field_map[offset_slot] : NULL;
> 	}
> }
> diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
> index 3cb1284..1154f74 100644
> --- a/src/box/tuple_format.h
> +++ b/src/box/tuple_format.h
> @@ -63,6 +63,7 @@ enum { TUPLE_OFFSET_SLOT_NIL = INT32_MAX };
> 
> struct tuple;
> struct tuple_format;
> +struct tuple_field_map;

15. What is this?

> 
> /** Engine-specific tuple format methods. */
> struct tuple_format_vtab {
> @@ -81,6 +82,11 @@ struct tuple_format_vtab {
> 					const char *end);
> };
> 
> +struct mh_field_t;
> +typedef size_t (*path_hash_f)(const char *str, uint32_t len);
> +extern path_hash_f path_hash;

16. What is this? I do not see where it is assigned or used.

> +
> +
> /** Tuple field meta information for tuple_format.
*/ > struct tuple_field { > /** > @@ -92,7 +98,7 @@ struct tuple_field { > */ > enum field_type type; > /** > - * Offset slot in field map in tuple. Normally tuple > + * Offset slot ix`n field map in tuple. Normally tuple 17. !? > * stores field map - offsets of all fields participating > * in indexes. This allows quick access to most used > * fields without parsing entire mspack. This member > diff --git a/test/engine/tuple.test.lua b/test/engine/tuple.test.lua > index 30d6f1a..e106dd0 100644 > --- a/test/engine/tuple.test.lua > +++ b/test/engine/tuple.test.lua > @@ -289,5 +289,32 @@ t["{"] > > s:drop() > > +-- > +-- gh-1012: Indexes for JSON-defined paths. > +-- > +s_withdata = box.schema.space.create('withdata', {engine = engine}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO["fname"]'}, {3, 'str', path = '[3]["FIO"].fname'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '[3]["FIO"]["fname"]'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'field.FIO.fname'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = '[3].FIO'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[3][1]'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO'}, {3, 'str', path = '[3]["FIO"].fname'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3][1].sname'}, {3, 'str', path = '[3]["FIO"].fname'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[2].FIO.fname'}}}) > +s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO....fname'}}}) > +idx = s_withdata:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}}) > +assert(idx ~= nil) > +s_withdata:insert{7, 7, {town = 'NY', FIO = 666}, 
4, 5}
> +s_withdata:insert{7, 7, {town = 'NY', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
> +s_withdata:insert{7, 7, {town = 'NY', FIO = {fname = "James"}}, 4, 5}
> +s_withdata:insert{7, 7, {town = 'NY', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
> +s_withdata:insert{7, 7, {town = 'NY', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
> +s_withdata:insert{7, 7, {town = 'NY', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
> +s_withdata:insert{7, 7, {town = 'NY', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
> +idx:select()
> +idx:min()
> +idx:max()
> +s_withdata:drop()

18. Sorry, but this is far too few tests. You did not even test a
primary JSON index, nor key extraction (upsert, vinyl replace/update,
etc.), nor tuple hashes (via dumps in vinyl, for example).