From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Date: Sat, 1 Dec 2018 00:28:32 +0300
From: Vladimir Davydov
Subject: Re: [tarantool-patches] [PATCH v5 5/9] box: introduce JSON indexes
Message-ID: <20181130212832.nkrqlervcg775dfx@esperanza>
References: 
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 
To: Kirill Shcherbatov
Cc: tarantool-patches@freelists.org, kostja@tarantool.org
List-ID: 

On Mon, Nov 26, 2018 at 01:49:39PM +0300, Kirill Shcherbatov wrote:
> New JSON-path-based indexes allows to index documents content.
> As we need to store user-defined JSON path in key_part
> and key_part_def, we have introduced path and path_len
> fields. JSON path is verified and transformed to canonical
> form on index msgpack unpack.

It is not anymore.

> Path string stored as a part of the key_def allocation:
>
> +-------+---------+-------+---------+-------+-------+-------+
> |key_def|key_part1|  ...  |key_partN| path1 | pathK | pathN |
> +-------+---------+-------+---------+-------+-------+-------+
>           |                         ^
>           |-> path _________________|
>
> With format creation JSON paths are stored at the end of format
> allocation:
> +------------+------------+-------+------------+-------+
> |tuple_format|tuple_field1|  ...  |tuple_fieldN| pathK |
> +------------+------------+-------+------------+-------+

I wonder what made you put this in the commit message. Is the way paths
are allocated that important? I don't think so. OTOH it would be nice
to say a few words here about the user API and some subtleties in the
implementation, e.g. about tuple_init_field_map.

>
> Part of #1012
> ---
>  src/box/errcode.h            |   2 +-
>  src/box/index_def.c          |   8 +-
>  src/box/key_def.c            | 164 +++++++++++++---
>  src/box/key_def.h            |  23 ++-
>  src/box/lua/space.cc         |   5 +
>  src/box/memtx_engine.c       |   3 +
>  src/box/sql.c                |   1 +
>  src/box/sql/build.c          |   1 +
>  src/box/sql/select.c         |   6 +-
>  src/box/sql/where.c          |   1 +
>  src/box/tuple.c              |  38 +---
>  src/box/tuple_compare.cc     |  13 +-
>  src/box/tuple_extract_key.cc |  21 ++-
>  src/box/tuple_format.c       | 439 ++++++++++++++++++++++++++++++++++++-------
>  src/box/tuple_format.h       |  38 +++-
>  src/box/tuple_hash.cc        |   2 +-
>  src/box/vinyl.c              |   3 +
>  src/box/vy_log.c             |   3 +-
>  src/box/vy_point_lookup.c    |   2 -
>  src/box/vy_stmt.c            | 166 +++++++++++++---
>  test/box/misc.result         |   1 +
>  test/engine/tuple.result     | 416 ++++++++++++++++++++++++++++++++++++++++
>  test/engine/tuple.test.lua   | 121 ++++++++++++
>  23 files changed, 1306 insertions(+), 171 deletions(-)
>
> diff --git a/src/box/key_def.c b/src/box/key_def.c
> index 2119ca3..bc6cecd 100644
> --- a/src/box/key_def.c
> +++ b/src/box/key_def.c
> @@ -106,13 +111,25 @@ const uint32_t key_mp_type[] = {
>  struct key_def *
>  key_def_dup(const struct key_def *src)
>  {
> -	size_t sz = key_def_sizeof(src->part_count);
> -	struct key_def *res = (struct key_def *)malloc(sz);
> +	const struct key_part *parts = src->parts;
> +	const struct key_part *parts_end = parts + src->part_count;
> +	size_t sz = 0;
> +	for (; parts < parts_end; parts++)
> +		sz += parts->path != NULL ? parts->path_len + 1 : 0;

Why do you require key_part::path to be nul-terminated? It looks
strange as you store the string length anyway. Without this
requirement, the code would look more straightforward as there would
be less branching. E.g. this piece of code would look like:

	for (uint32_t i = 0; i < src->part_count; i++)
		sz += src->parts[i].path_len;

> +	sz = key_def_sizeof(src->part_count, sz);
> +	struct key_def *res = (struct key_def *)calloc(1, sz);
>  	if (res == NULL) {
>  		diag_set(OutOfMemory, sz, "malloc", "res");
>  		return NULL;
>  	}
>  	memcpy(res, src, sz);
> +	/* Update paths to point to the new memory chunk.*/
> +	for (uint32_t i = 0; i < src->part_count; i++) {
> +		if (src->parts[i].path == NULL)
> +			continue;
> +		size_t path_offset = src->parts[i].path - (char *)src;
> +		res->parts[i].path = (char *)res + path_offset;
> +	}
>  	return res;
>  }
>
> @@ -120,8 +137,23 @@ void
>  key_def_swap(struct key_def *old_def, struct key_def *new_def)
>  {
>  	assert(old_def->part_count == new_def->part_count);
> -	for (uint32_t i = 0; i < new_def->part_count; i++)
> -		SWAP(old_def->parts[i], new_def->parts[i]);
> +	for (uint32_t i = 0; i < new_def->part_count; i++) {
> +		if (old_def->parts[i].path == NULL) {
> +			SWAP(old_def->parts[i], new_def->parts[i]);
> +		} else {
> +			/*
> +			 * Since the data is located in memory
> +			 * in the same order (otherwise rebuild
> +			 * would be called), just update the
> +			 * pointers.
> +			 */
> +			size_t path_offset =
> +				old_def->parts[i].path - (char *)old_def;
> +			SWAP(old_def->parts[i], new_def->parts[i]);
> +			old_def->parts[i].path = (char *)old_def + path_offset;
> +			new_def->parts[i].path = (char *)new_def + path_offset;
> +		}
> +	}

This would be shorter:

	for (uint32_t i = 0; i < new_def->part_count; i++) {
		SWAP(old_def->parts[i], new_def->parts[i]);
		/*
		 * Paths are allocated as a part of key_def so
		 * we need to swap path pointers back - it's OK
		 * as paths aren't supposed to change.
		 */
		assert(old_def->parts[i].path_len ==
		       new_def->parts[i].path_len);
		SWAP(old_def->parts[i].path, new_def->parts[i].path);
	}

>  	SWAP(*old_def, *new_def);
>  }
>
> @@ -184,16 +231,23 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count)
>  			}
>  			coll = coll_id->coll;
>  		}
> +		uint32_t path_len = 0;
> +		if (part->path != NULL) {
> +			path_len = strlen(part->path);
> +			def->parts[i].path = data;
> +			data += path_len + 1;
> +		}
>  		key_def_set_part(def, i, part->fieldno, part->type,
>  				 part->nullable_action, coll, part->coll_id,
> -				 part->sort_order);
> +				 part->sort_order, part->path, path_len);
>  	}
>  	key_def_set_cmp(def);
>  	return def;
>  }
>
> -void
> -key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
> +int
> +key_def_dump_parts(struct region *pool, const struct key_def *def,
> +		   struct key_part_def *parts)

Region should be last in the argument list, because it's just an
auxiliary object. The emphasis should be on def and parts here.

>  {
>  	for (uint32_t i = 0; i < def->part_count; i++) {
>  		const struct key_part *part = &def->parts[i];
> @@ -445,6 +539,7 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
>  		return key_def_decode_parts_166(parts, part_count, data,
>  						fields, field_count);
>  	}
> +	struct region *region = &fiber()->gc;

I don't think it's worth making key_def.c depend on fiber.h only
because we need to use a region here. Let's pass it to
key_def_decode_parts as an extra argument, as we do in case of
key_def_dump_parts.
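Just to illustrate, the signature I have in mind (a sketch only - the
argument name and position are up to you, the rest is as in the
current prototype):

	int
	key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
			     const char **data, const struct field_def *fields,
			     uint32_t field_count, struct region *region);

Then the box-level callers would pass &fiber()->gc explicitly, and
key_def.c itself would stay fiber-agnostic.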
>  	for (uint32_t i = 0; i < part_count; i++) {
>  		struct key_part_def *part = &parts[i];
>  		if (mp_typeof(**data) != MP_MAP) {
> @@ -595,9 +708,14 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
>  	for (; part != end; part++) {
>  		if (key_def_find(first, part) != NULL)
>  			continue;
> +		if (part->path != NULL) {
> +			new_def->parts[pos].path = data;
> +			data += part->path_len + 1;
> +		}

I counted three places where you initialize part->path before passing
it to key_def_set_part. I think it would be nice to move part->path
initialization to key_def_set_part.

>  		key_def_set_part(new_def, pos++, part->fieldno, part->type,
>  				 part->nullable_action, part->coll,
> -				 part->coll_id, part->sort_order);
> +				 part->coll_id, part->sort_order, part->path,
> +				 part->path_len);
>  	}
>  	key_def_set_cmp(new_def);
>  	return new_def;
> diff --git a/src/box/key_def.h b/src/box/key_def.h
> index d4da6c5..7731e48 100644
> --- a/src/box/key_def.h
> +++ b/src/box/key_def.h
> @@ -68,6 +68,8 @@ struct key_part_def {
>  	enum on_conflict_action nullable_action;
>  	/** Part sort order. */
>  	enum sort_order sort_order;
> +	/** JSON path to data. */
> +	const char *path;

It would be nice to make the comment a little bit more thorough:

	/**
	 * JSON path to indexed data, relative to the field number,
	 * or NULL if this key part indexes a top-level field.
	 */

>  };
>
>  extern const struct key_part_def key_part_def_default;
> @@ -86,6 +88,13 @@ struct key_part {
>  	enum on_conflict_action nullable_action;
>  	/** Part sort order. */
>  	enum sort_order sort_order;
> +	/**
> +	 * JSON path to data in 'canonical' form.
> +	 * Read json_path_normalize to get more details.
> +	 */

There's no json_path_normalize anymore. Please fix the comment. Also,
please mention in the comment whether the string is nul-terminated or
not, because the presence of @path_len suggests it isn't. And say a
few words on how these paths are allocated.

> +	char *path;
> +	/** The length of JSON path. */
> +	uint32_t path_len;
>  };
>
>  struct key_def;
> @@ -260,8 +272,9 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count);
>  /**
>   * Dump part definitions of the given key def.
>   */
> -void
> -key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
> +int
> +key_def_dump_parts(struct region *pool, const struct key_def *def,
> +		   struct key_part_def *parts);

Please update the comment as well - explain what the region is used
for, why this function can fail.

>
>  /**
>   * Update 'has_optional_parts' of @a key_def with correspondence
> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index ca709b4..0734712 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -1356,6 +1357,9 @@ sql_key_info_new(sqlite3 *db, uint32_t part_count)
>  struct sql_key_info *
>  sql_key_info_new_from_key_def(sqlite3 *db, const struct key_def *key_def)
>  {
> +	/** SQL key_parts could not have JSON paths. */
> +	for (uint32_t i = 0; i < key_def->part_count; i++)
> +		assert(key_def->parts[i].path == NULL);
>  	struct sql_key_info *key_info = sqlite3DbMallocRawNN(db,
>  			sql_key_info_sizeof(key_def->part_count));
>  	if (key_info == NULL) {
> @@ -1366,7 +1370,7 @@ sql_key_info_new_from_key_def(sqlite3 *db, const struct key_def *key_def)
>  	key_info->key_def = NULL;
>  	key_info->refs = 1;
>  	key_info->part_count = key_def->part_count;
> -	key_def_dump_parts(key_def, key_info->parts);
> +	key_def_dump_parts(&fiber()->gc, key_def, key_info->parts);

If you passed region=NULL to this function, you wouldn't need the
assertion above.
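To illustrate both points - dumping a path via the region and allowing
region=NULL - the loop body of key_def_dump_parts could look something
like this (a sketch; the OutOfMemory diag and other error handling are
trimmed):

	if (part->path != NULL) {
		/* Only JSON key parts need the region. */
		assert(region != NULL);
		char *path = region_alloc(region, part->path_len + 1);
		if (path == NULL)
			return -1;
		memcpy(path, part->path, part->path_len);
		path[part->path_len] = '\0';
		parts[i].path = path;
	} else {
		parts[i].path = NULL;
	}

Then callers that are known to deal with plain key defs, like
sql_key_info_new_from_key_def, could pass NULL and get the check for
free instead of open-coding the assertion.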
>  	return key_info;
>  }
>
> diff --git a/src/box/tuple.c b/src/box/tuple.c
> index aae1c3c..62e06e7 100644
> --- a/src/box/tuple.c
> +++ b/src/box/tuple.c
> @@ -138,38 +138,18 @@ runtime_tuple_delete(struct tuple_format *format, struct tuple *tuple)
>  int
>  tuple_validate_raw(struct tuple_format *format, const char *tuple)
>  {
> -	if (tuple_format_field_count(format) == 0)
> -		return 0; /* Nothing to check */
> -
> -	/* Check to see if the tuple has a sufficient number of fields. */
> -	uint32_t field_count = mp_decode_array(&tuple);
> -	if (format->exact_field_count > 0 &&
> -	    format->exact_field_count != field_count) {
> -		diag_set(ClientError, ER_EXACT_FIELD_COUNT,
> -			 (unsigned) field_count,
> -			 (unsigned) format->exact_field_count);
> +	struct region *region = &fiber()->gc;
> +	uint32_t used = region_used(region);
> +	uint32_t *field_map = region_alloc(region, format->field_map_size);
> +	if (field_map == NULL) {
> +		diag_set(OutOfMemory, format->field_map_size, "region_alloc",
> +			 "field_map");
>  		return -1;
>  	}
> -	if (unlikely(field_count < format->min_field_count)) {
> -		diag_set(ClientError, ER_MIN_FIELD_COUNT,
> -			 (unsigned) field_count,
> -			 (unsigned) format->min_field_count);
> +	field_map = (uint32_t *)((char *)field_map + format->field_map_size);
> +	if (tuple_init_field_map(format, field_map, tuple, true) != 0)
>  		return -1;
> -	}
> -
> -	/* Check field types */
> -	struct tuple_field *field = tuple_format_field(format, 0);
> -	uint32_t i = 0;
> -	uint32_t defined_field_count =
> -		MIN(field_count, tuple_format_field_count(format));
> -	for (; i < defined_field_count; ++i) {
> -		field = tuple_format_field(format, i);
> -		if (key_mp_type_validate(field->type, mp_typeof(*tuple),
> -					 ER_FIELD_TYPE, i + TUPLE_INDEX_BASE,
> -					 tuple_field_is_nullable(field)))
> -			return -1;
> -		mp_next(&tuple);
> -	}
> +	region_truncate(region, used);
>  	return 0;

This could be done in a separate patch with a proper justification.

>  }
>
> diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
> index e21b009..554c29f 100644
> --- a/src/box/tuple_compare.cc
> +++ b/src/box/tuple_compare.cc
> @@ -283,6 +285,15 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
>  			current_fieldno++;
>  		}
>  	}
> +	const char *field_last, *field_end_last;
> +	if (part->path != NULL) {
> +		field_last = field;
> +		field_end_last = field_end;
> +		(void)tuple_field_go_to_path(&field, part->path,
> +					     part->path_len);

Please add an assertion so that the reader would clearly see you don't
expect an error here.

> +		field_end = field;
> +		mp_next(&field_end);
> +	}
>  	memcpy(key_buf, field, field_end - field);
>  	key_buf += field_end - field;
>  	if (has_optional_parts && null_count != 0) {
> diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
> index 92028c5..193d0d8 100644
> --- a/src/box/tuple_format.c
> +++ b/src/box/tuple_format.c
> @@ -51,7 +52,8 @@ tuple_field_create(struct json_token *token)
>  	ret->offset_slot = TUPLE_OFFSET_SLOT_NIL;
>  	ret->coll_id = COLL_NONE;
>  	ret->nullable_action = ON_CONFLICT_ACTION_NONE;
> -	ret->token = *token;
> +	if (token != NULL)
> +		ret->token = *token;

As I mentioned in my review to the previous patch, I think it would be
better not to pass token to this function, because you typically pass
NULL. Instead set token->type to JSON_TOKEN_END and initialize it
properly afterwards.

>  	return ret;
>  }
>
> @@ -61,14 +63,114 @@ tuple_field_destroy(struct tuple_field *field)
>  	free(field);
>  }
>
> +/** Build a JSON tree path for specified path. */
> +static struct tuple_field *
> +tuple_field_tree_add_path(struct tuple_format *format, const char *path,
> +			  uint32_t path_len, uint32_t fieldno)
> +{
> +	int rc = 0;
> +	struct json_tree *tree = &format->tree;
> +	struct tuple_field *parent = tuple_format_field(format, fieldno);
> +	struct tuple_field *field = tuple_field_create(NULL);
> +	if (unlikely(field == NULL))

Please don't use likely/unlikely in tuple format constructor - this
path is far from hot.

> +		goto end;
> +
> +	struct json_lexer lexer;
> +	bool is_last_new = false;
> +	json_lexer_create(&lexer, path, path_len);
> +	while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 &&
> +	       field->token.key.type != JSON_TOKEN_END) {
> +		enum field_type iterm_node_type =

iterm_node_type? What is that supposed to mean? Maybe call it simply
expected_type?

> +			field->token.key.type == JSON_TOKEN_STR ?
> +			FIELD_TYPE_MAP : FIELD_TYPE_ARRAY;
> +		if (parent->type != FIELD_TYPE_ANY &&
> +		    parent->type != iterm_node_type) {
> +			const char *name =
> +				tt_sprintf("[%d]%.*s", fieldno, path_len, path);
> +			diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH, name,
> +				 field_type_strs[parent->type],
> +				 field_type_strs[iterm_node_type]);
> +			parent = NULL;
> +			goto end;
> +		}
> +		struct tuple_field *next =
> +			json_tree_lookup_entry(tree, &parent->token,
> +					       &field->token,
> +					       struct tuple_field, token);
> +		if (next == NULL) {
> +			rc = json_tree_add(tree, &parent->token, &field->token);
> +			if (unlikely(rc != 0)) {
> +				diag_set(OutOfMemory, sizeof(struct json_token),
> +					 "json_tree_add", "tree");
> +				parent = NULL;
> +				goto end;
> +			}
> +			next = field;
> +			is_last_new = true;
> +			field = tuple_field_create(NULL);
> +			if (unlikely(next == NULL))

Looks like you forgot to set 'parent' to NULL here. Let's add a new
label, say 'fail', which would set 'parent' to NULL and then jump to
'end'?

> +				goto end;
> +		} else {
> +			is_last_new = false;
> +		}
> +		parent->type = iterm_node_type;
> +		parent = next;
> +	}
> +	if (rc != 0 || field->token.key.type != JSON_TOKEN_END) {

We break the loop if either 'rc' is not 0 or token.type ==
JSON_TOKEN_END. Therefore you don't need to check token.type here.

> +		const char *err_msg =
> +			tt_sprintf("invalid JSON path '%s': path has invalid "
> +				   "structure (error at position %d)", path,
> +				   rc);
> +		diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
> +			 fieldno + TUPLE_INDEX_BASE, err_msg);

IMO it would be better to move path validation to key_def_decode_parts
- it would be more obvious that way, because we do most of the checks
there. Besides, what's the point of going as far as format creation if
we could return an error right when we parse the input? Moving the
check to key_def_decode_parts would mean parsing the path one more
time, but who cares - index creation is a rare event.

If you agree, then please add json_path_validate in the same patch
where you add json_path_cmp.

> +		parent = NULL;
> +		goto end;
> +	}
> +	assert(parent != NULL);
> +	/* Update tree depth information. */
> +	if (is_last_new) {
> +		uint32_t depth = 1;
> +		for (struct json_token *iter = parent->token.parent;
> +		     iter != &format->tree.root; iter = iter->parent, ++depth) {
> +			struct tuple_field *record =
> +				json_tree_entry(iter, struct tuple_field,
> +						token);
> +			record->subtree_depth =
> +				MAX(record->subtree_depth, depth);

Why do you need to maintain subtree_depth for each tuple_field? You
never use it after the format is created.

To initialize format->subtree_depth you could simply count the number
of tokens in the path locally and then do

	format->subtree_depth = MAX(format->subtree_depth, token_count);

no?

BTW subtree_depth doesn't look like a descriptive name. Perhaps we
should call it max_path_tokens to reflect its true nature and use 0 if
there are no JSON fields?

> +		}
> +	}
> +end:
> +	tuple_field_destroy(field);
> +	return parent;
> +}
> +
>  static int
>  tuple_format_use_key_part(struct tuple_format *format,
>  			  const struct field_def *fields, uint32_t field_count,
>  			  const struct key_part *part, bool is_sequential,
> -			  int *current_slot)
> +			  int *current_slot, char **path_data)
>  {
>  	assert(part->fieldno < tuple_format_field_count(format));
>  	struct tuple_field *field = tuple_format_field(format, part->fieldno);
> +	if (unlikely(part->path != NULL)) {
> +		assert(!is_sequential);
> +		/**
> +		 * Copy JSON path data to reserved area at the
> +		 * end of format allocation.
> +		 */
> +		memcpy(*path_data, part->path, part->path_len);
> +		(*path_data)[part->path_len] = '\0';
> +		struct tuple_field *root = field;
> +		field = tuple_field_tree_add_path(format, *path_data,
> +						  part->path_len,
> +						  part->fieldno);
> +		if (field == NULL)
> +			return -1;
> +		format->subtree_depth =
> +			MAX(format->subtree_depth, root->subtree_depth + 1);
> +		field->is_key_part = true;

Why do you set this flag here? There's code below in this function
that does the same.

> +		*path_data += part->path_len + 1;
> +	}
>  	/*
>  	 * If a field is not present in the space format,
>  	 * inherit nullable action of the first key part
> @@ -113,7 +215,10 @@ tuple_format_use_key_part(struct tuple_format *format,
>  			  field->type)) {
>  		const char *name;
>  		int fieldno = part->fieldno + TUPLE_INDEX_BASE;
> -		if (part->fieldno >= field_count) {
> +		if (unlikely(part->path != NULL)) {
> +			name = tt_sprintf("[%d]%.*s", fieldno, part->path_len,
> +					  part->path);

For regular fields we try to find field name when reporting an error
(the code is right below), but for JSON fields, we don't. Fix that
please.

> +		} else if (part->fieldno >= field_count) {
>  			name = tt_sprintf("%d", fieldno);
>  		} else {
>  			const struct field_def *def =
> @@ -137,10 +242,9 @@ tuple_format_use_key_part(struct tuple_format *format,
>  	 * simply accessible, so we don't store an offset for it.
>  	 */
>  	if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL &&
> -	    is_sequential == false && part->fieldno > 0) {
> -		*current_slot = *current_slot - 1;
> -		field->offset_slot = *current_slot;
> -	}
> +	    is_sequential == false &&
> +	    (part->fieldno > 0 || part->path != NULL))
> +		field->offset_slot = (*current_slot = *current_slot - 1);

Why did you change this assignment? It looked cleaner before.

>  	return 0;
>  }
>
> @@ -377,16 +489,37 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
>  {
>  	if (format1->exact_field_count != format2->exact_field_count)
>  		return false;
> -	uint32_t format1_field_count = tuple_format_field_count(format1);
> -	uint32_t format2_field_count = tuple_format_field_count(format2);
> -	for (uint32_t i = 0; i < format1_field_count; ++i) {
> -		const struct tuple_field *field1 =
> -			tuple_format_field(format1, i);
> +	struct tuple_field *field1;
> +	struct json_token *field2_prev_token = NULL;
> +	struct json_token *skip_root_token = NULL;
> +	struct json_token *field1_prev_token = &format1->tree.root;
> +	json_tree_foreach_entry_preorder(field1, &format1->tree.root,
> +					 struct tuple_field, token) {
> +		/* Test if subtree skip is required. */
> +		if (skip_root_token != NULL) {
> +			struct json_token *tmp = &field1->token;
> +			while (tmp->parent != NULL &&
> +			       tmp->parent != skip_root_token)
> +				tmp = tmp->parent;
> +			if (tmp->parent == skip_root_token)
> +				continue;
> +		}
> +		skip_root_token = NULL;
> +		/* Lookup for a valid parent node in new tree. */
> +		while (field1_prev_token != field1->token.parent) {
> +			field1_prev_token = field1_prev_token->parent;
> +			field2_prev_token = field2_prev_token->parent;
> +			assert(field1_prev_token != NULL);
> +		}
> +		struct tuple_field *field2 =
> +			json_tree_lookup_entry(&format2->tree, field2_prev_token,
> +					       &field1->token,
> +					       struct tuple_field, token);

I fail to understand this code. Can we simplify it somehow? Maybe we
could just walk over all format1->fields and try to look up the
corresponding format2 field? That would be suboptimal, but this
function is called only on DDL so understandability is top-prio here.

Another option is to try to make this code generic and move it to the
json library, augmenting it with some commentary - IMHO it looks too
complex for tuple_format.c, but in json.c it would probably be OK, if
wrapped properly.

>  		/*
>  		 * The field has a data type in format1, but has
>  		 * no data type in format2.
>  		 */
> -		if (i >= format2_field_count) {
> +		if (field2 == NULL) {
>  			/*
>  			 * The field can get a name added
>  			 * for it, and this doesn't require a data
> @@ -397,13 +530,13 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
>  			 * NULLs or miss the subject field.
>  			 */
>  			if (field1->type == FIELD_TYPE_ANY &&
> -			    tuple_field_is_nullable(field1))
> +			    tuple_field_is_nullable(field1)) {
> +				skip_root_token = &field1->token;
>  				continue;
> -			else
> +			} else {
>  				return false;
> +			}
>  		}
> -		const struct tuple_field *field2 =
> -			tuple_format_field(format2, i);
>  		if (! field_type1_contains_type2(field1->type, field2->type))
>  			return false;
>  		/*
> @@ -413,10 +546,82 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
>  		if (tuple_field_is_nullable(field2) &&
>  		    !tuple_field_is_nullable(field1))
>  			return false;
> +
> +		field2_prev_token = &field2->token;
> +		field1_prev_token = &field1->token;
>  	}
>  	return true;
>  }
>
> +/** Find a field in format by offset slot. */
> +static struct tuple_field *
> +tuple_field_by_offset_slot(const struct tuple_format *format,
> +			   int32_t offset_slot)
> +{
> +	struct tuple_field *field;
> +	struct json_token *root = (struct json_token *)&format->tree.root;
> +	json_tree_foreach_entry_preorder(field, root, struct tuple_field,
> +					 token) {
> +		if (field->offset_slot == offset_slot)
> +			return field;
> +	}
> +	return NULL;
> +}
> +
> +/**
> + * Verify field_map and raise error on some indexed field has
> + * not been initialized. Routine rely on field_map has been
> + * initialized with UINT32_MAX marker before field_map
> + * initialization.
> + */
> +static int
> +tuple_field_map_validate(const struct tuple_format *format, uint32_t *field_map)
> +{
> +	struct json_token *tree_node = (struct json_token *)&format->tree.root;
> +	/* Lookup for absent not-nullable fields. */
> +	int32_t field_map_items =
> +		(int32_t)(format->field_map_size/sizeof(field_map[0]));
> +	for (int32_t i = -1; i >= -field_map_items; i--) {
> +		if (field_map[i] != UINT32_MAX)
> +			continue;
> +
> +		struct tuple_field *field =
> +			tuple_field_by_offset_slot(format, i);
> +		assert(field != NULL);
> +		/* Lookup for field number in tree. */
> +		struct json_token *parent = &field->token;
> +		while (parent->parent != &format->tree.root)
> +			parent = parent->parent;
> +		assert(parent->key.type == JSON_TOKEN_NUM);
> +		uint32_t fieldno = parent->key.num;
> +
> +		tree_node = &field->token;
> +		const char *err_msg;
> +		if (field->token.key.type == JSON_TOKEN_STR) {
> +			err_msg = tt_sprintf("invalid field %d document "
> +					     "content: map doesn't contain a "
> +					     "key '%.*s' defined in index",
> +					     fieldno, tree_node->key.len,
> +					     tree_node->key.str);
> +		} else if (field->token.key.type == JSON_TOKEN_NUM) {
> +			err_msg = tt_sprintf("invalid field %d document "
> +					     "content: array size %d is less "
> +					     "than size %d defined in index",
> +					     fieldno, tree_node->key.num,
> +					     tree_node->parent->child_count);
> +		}
> +		diag_set(ClientError, ER_DATA_STRUCTURE_MISMATCH, err_msg);
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +struct parse_ctx {

The struct name is too generic.

> +	enum json_token_type child_type;
> +	uint32_t items;
> +	uint32_t curr;

Comments...

> +};
> +
>  /** @sa declaration for details. */
>  int
>  tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
> @@ -442,44 +647,123 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
>  			 (unsigned) format->min_field_count);
>  		return -1;
>  	}
> -
> -	/* first field is simply accessible, so we do not store offset to it */
> -	enum mp_type mp_type = mp_typeof(*pos);
> -	const struct tuple_field *field =
> -		tuple_format_field((struct tuple_format *)format, 0);
> -	if (validate &&
> -	    key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
> -				 TUPLE_INDEX_BASE, tuple_field_is_nullable(field)))
> -		return -1;
> -	mp_next(&pos);
> -	/* other fields...*/
> -	uint32_t i = 1;
>  	uint32_t defined_field_count = MIN(field_count, validate ?
>  					   tuple_format_field_count(format) :
>  					   format->index_field_count);
> -	if (field_count < format->index_field_count) {
> -		/*
> -		 * Nullify field map to be able to detect by 0,
> -		 * which key fields are absent in tuple_field().
> -		 */
> -		memset((char *)field_map - format->field_map_size, 0,
> -		       format->field_map_size);
> -	}
> -	for (; i < defined_field_count; ++i) {
> -		field = tuple_format_field((struct tuple_format *)format, i);
> -		mp_type = mp_typeof(*pos);
> -		if (validate &&
> -		    key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
> -					 i + TUPLE_INDEX_BASE,
> -					 tuple_field_is_nullable(field)))
> -			return -1;
> -		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> -			field_map[field->offset_slot] =
> -				(uint32_t) (pos - tuple);
> +	/*
> +	 * Fill field_map with marker for toutine

Typo: toutine -> routine

> +	 * tuple_field_map_validate to detect absent fields.
> +	 */
> +	memset((char *)field_map - format->field_map_size,
> +	       validate ? UINT32_MAX : 0, format->field_map_size);
> +
> +	struct region *region = &fiber()->gc;
> +	uint32_t mp_stack_items = format->subtree_depth + 1;
> +	uint32_t mp_stack_size = mp_stack_items * sizeof(struct parse_ctx);
> +	struct parse_ctx *mp_stack = region_alloc(region, mp_stack_size);
> +	if (unlikely(mp_stack == NULL)) {
> +		diag_set(OutOfMemory, mp_stack_size, "region_alloc",
> +			 "mp_stack");
> +		return -1;
> +	}
> +	mp_stack[0] = (struct parse_ctx){
> +		.child_type = JSON_TOKEN_NUM,
> +		.items = defined_field_count,
> +		.curr = 0,
> +	};
> +	uint32_t mp_stack_idx = 0;
> +	struct json_tree *tree = (struct json_tree *)&format->tree;
> +	struct json_token *parent = &tree->root;
> +	while (mp_stack[0].curr <= mp_stack[0].items) {
> +		/* Prepare key for tree lookup. */
> +		struct json_token token;
> +		token.key.type = mp_stack[mp_stack_idx].child_type;
> +		++mp_stack[mp_stack_idx].curr;
> +		if (token.key.type == JSON_TOKEN_NUM) {
> +			token.key.num = mp_stack[mp_stack_idx].curr;
> +		} else if (token.key.type == JSON_TOKEN_STR) {
> +			if (mp_typeof(*pos) != MP_STR) {
> +				/*
> +				 * We do not support non-string
> +				 * keys in maps.
> +				 */
> +				mp_next(&pos);
> +				mp_next(&pos);
> +				continue;
> +			}
> +			token.key.str =
> +				mp_decode_str(&pos, (uint32_t *)&token.key.len);
> +		} else {
> +			unreachable();
> +		}
> +		struct tuple_field *field =
> +			json_tree_lookup_entry(tree, parent, &token,
> +					       struct tuple_field, token);
> +		enum mp_type type = mp_typeof(*pos);
> +		if (field != NULL) {
> +			bool is_nullable = tuple_field_is_nullable(field);
> +			if (validate &&
> +			    key_mp_type_validate(field->type, type,
> +						 ER_FIELD_TYPE,
> +						 mp_stack[0].curr,
> +						 is_nullable) != 0)
> +				return -1;
> +			if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
> +				field_map[field->offset_slot] =
> +					(uint32_t)(pos - tuple);
> +			}
> +		}
> +		/* Prepare stack info for next iteration. */
> +		if (field != NULL && type == MP_ARRAY &&
> +		    mp_stack_idx + 1 < format->subtree_depth) {
> +			uint32_t size = mp_decode_array(&pos);
> +			if (unlikely(size == 0))
> +				continue;
> +			parent = &field->token;
> +			mp_stack[++mp_stack_idx] = (struct parse_ctx){
> +				.child_type = JSON_TOKEN_NUM,
> +				.items = size,
> +				.curr = 0,
> +			};
> +		} else if (field != NULL && type == MP_MAP &&
> +			   mp_stack_idx + 1 < format->subtree_depth) {
> +			uint32_t size = mp_decode_map(&pos);
> +			if (unlikely(size == 0))
> +				continue;
> +			parent = &field->token;
> +			mp_stack[++mp_stack_idx] = (struct parse_ctx){
> +				.child_type = JSON_TOKEN_STR,
> +				.items = size,
> +				.curr = 0,
> +			};
> +		} else {
> +			mp_next(&pos);
> +			while (mp_stack[mp_stack_idx].curr >=
> +			       mp_stack[mp_stack_idx].items) {
> +				assert(parent != NULL);
> +				parent = parent->parent;
> +				if (mp_stack_idx-- == 0)
> +					goto end;
> +			}

This loop looks sloppy. We need to rewrite it somehow to make it more
compact and straightforward. I think it's possible - there's quite a
bit of code duplication. Maybe factor out something into a helper
function. And add some comments to make it easier to understand.

>  		}
> -		mp_next(&pos);
> +	};
> +end:;
> +	/*
> +	 * Field map has already been initialized with zeros when
> +	 * no validation is required.
> +	 */
> +	if (!validate)
> +		return 0;
> +	struct tuple_field *field;
> +	struct json_token *root = (struct json_token *)&format->tree.root;
> +	json_tree_foreach_entry_preorder(field, root, struct tuple_field,
> +					 token) {
> +		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
> +		    tuple_field_is_nullable(field) &&
> +		    field_map[field->offset_slot] == UINT32_MAX)
> +			field_map[field->offset_slot] = 0;

Bad indentation.

Anyway, can't we eliminate this extra walk over the JSON tree? For
example, store the initial field map in tuple_format with nullable
fields already set to zeros and others set to UINT32_MAX and use it
for initialization.
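Roughly like this (just a sketch: field_map_template is a hypothetical
new tuple_format member built once at format creation; allocation and
error handling are omitted):

	/* In tuple_format_create(), after offset slots are assigned: */
	uint32_t *template_end =
		(uint32_t *)((char *)format->field_map_template +
			     format->field_map_size);
	/* UINT32_MAX in every slot... */
	memset(format->field_map_template, 0xff, format->field_map_size);
	struct tuple_field *field;
	struct json_token *root = (struct json_token *)&format->tree.root;
	json_tree_foreach_entry_preorder(field, root, struct tuple_field,
					 token) {
		/* ...except nullable fields, which default to 0. */
		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
		    tuple_field_is_nullable(field))
			template_end[field->offset_slot] = 0;
	}

	/* In tuple_init_field_map(), instead of memset + final walk: */
	if (validate)
		memcpy((char *)field_map - format->field_map_size,
		       format->field_map_template, format->field_map_size);

Then a slot that is still UINT32_MAX after parsing belongs to a missing
non-nullable field, and missing nullable fields are already zero.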
>  	}
> -	return 0;
> +	return tuple_field_map_validate(format, field_map);
>  }
>
>  uint32_t
> @@ -731,3 +1007,40 @@ error:
>  		   tt_sprintf("error in path on position %d", rc));
>  	return -1;
>  }
> +
> +const char *
> +tuple_field_by_part_raw(struct tuple_format *format, const char *data,
> +			const uint32_t *field_map, struct key_part *part)
> +{
> +	if (likely(part->path == NULL))
> +		return tuple_field_raw(format, data, field_map, part->fieldno);

I think we need to make lookup by fieldno inline and do a call only if
part->path != NULL.

> +
> +	uint32_t field_count = tuple_format_field_count(format);
> +	struct tuple_field *root_field =
> +		likely(part->fieldno < field_count) ?
> +		tuple_format_field(format, part->fieldno) : NULL;
> +	struct tuple_field *field =
> +		unlikely(root_field == NULL) ? NULL:

Please stop using likely/unlikely. I struggle to read the code because
of them and I don't understand why you use them anyway.

> +		tuple_format_field_by_path(format, root_field, part->path,
> +					   part->path_len);

Can't you move top-level field lookup to tuple_format_field_by_path?
Both places in the code that use this function do the lookup by
themselves anyway.

> +	if (unlikely(field == NULL)) {
> +		/*
> +		 * Legacy tuple having no field map for JSON
> +		 * index require full path parse.

Legacy? What does it mean? Improve the comment please.

> +		 */
> +		const char *field_raw =
> +			tuple_field_raw(format, data, field_map, part->fieldno);
> +		if (unlikely(field_raw == NULL))
> +			return NULL;
> +		if (tuple_field_go_to_path(&field_raw, part->path,
> +					   part->path_len) != 0)

tuple_field_go_to_path can't return a non-zero code here. Please add
an assertion.

> +			return NULL;
> +		return field_raw;
> +	}
> +	int32_t offset_slot = field->offset_slot;
> +	assert(offset_slot < 0);
> +	assert(-offset_slot * sizeof(uint32_t) <= format->field_map_size);
> +	if (unlikely(field_map[offset_slot] == 0))
> +		return NULL;
> +	return data + field_map[offset_slot];

Better move this optimistic path above, before the pessimistic path
handling "legacy" tuples, because one usually expects optimistic paths
to come first in a function.

> +}
> diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
> index 2da773b..860f052 100644
> --- a/src/box/tuple_format.h
> +++ b/src/box/tuple_format.h
> @@ -116,6 +116,8 @@ struct tuple_field {
>  	uint32_t coll_id;
>  	/** An JSON entry to organize tree. */
>  	struct json_token token;
> +	/** A maximum depth of field subtree. */
> +	uint32_t subtree_depth;
>  };
>
>  /**
> @@ -169,12 +171,16 @@ struct tuple_format {
>  	 * index_field_count <= min_field_count <= field_count.
>  	 */
>  	uint32_t min_field_count;
> +	/** Size of format allocation. */
> +	uint32_t allocation_size;

I don't see it used anywhere after format construction. Why do you
need it?

>  	/**
>  	 * Shared names storage used by all formats of a space.
>  	 */
>  	struct tuple_dictionary *dict;
>  	/** JSON tree of fields. */
>  	struct json_tree tree;
> +	/** A maximum depth of fields subtree. */
> +	uint32_t subtree_depth;

Please add a comment explaining why you need it.

>  };
>
>
> diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
> index 7b704b8..9d5e220 100644
> --- a/src/box/vy_point_lookup.c
> +++ b/src/box/vy_point_lookup.c
> @@ -196,8 +196,6 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx,
>  		const struct vy_read_view **rv,
>  		struct tuple *key, struct tuple **ret)
>  {
> -	assert(tuple_field_count(key) >= lsm->cmp_def->part_count);
> -

I don't like that you simply throw away this assertion. Please think
how we can keep it.

>  	*ret = NULL;
>  	double start_time = ev_monotonic_now(loop());
>  	int rc = 0;
> diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
> index 3e60fec..2f35284 100644
> --- a/src/box/vy_stmt.c
> +++ b/src/box/vy_stmt.c
> @@ -29,6 +29,7 @@
>   * SUCH DAMAGE.
>   */
>
> +#include "assoc.h"
>  #include "vy_stmt.h"
>
>  #include 
> @@ -370,6 +371,85 @@ vy_stmt_replace_from_upsert(const struct tuple *upsert)
>  	return replace;
>  }
>
> +/**
> + * Construct tuple or calculate it's size. The fields_iov_ht
> + * is a hashtable that links leaf field records of field path
> + * tree and iovs that contain raw data. Function also fills the
> + * tuple field_map when write_data flag is set true.
> + */
> +static void
> +vy_stmt_tuple_restore_raw(struct tuple_format *format, char *tuple_raw,
> +			  uint32_t *field_map, char **offset,
> +			  struct mh_i64ptr_t *fields_iov_ht, bool write_data)
> +{
> +	struct tuple_field *prev = NULL;
> +	struct tuple_field *curr;
> +	json_tree_foreach_entry_preorder(curr, &format->tree.root,
> +					 struct tuple_field, token) {
> +		struct json_token *curr_node = &curr->token;
> +		struct tuple_field *parent =
> +			curr_node->parent == NULL ? NULL :
> +			json_tree_entry(curr_node->parent, struct tuple_field,
> +					token);
> +		if (parent != NULL && parent->type == FIELD_TYPE_ARRAY &&
> +		    curr_node->sibling_idx > 0) {
> +			/*
> +			 * Fill unindexed array items with nulls.
> +			 * Gaps size calculated as a difference
> +			 * between sibling nodes.
> +			 */
> +			for (uint32_t i = curr_node->sibling_idx - 1;
> +			     curr_node->parent->children[i] == NULL &&
> +			     i > 0; i--) {
> +				*offset = !write_data ?
> +					  (*offset += mp_sizeof_nil()) :
> +					  mp_encode_nil(*offset);
> +			}
> +		} else if (parent != NULL && parent->type == FIELD_TYPE_MAP) {
> +			/* Set map key. */
> +			const char *str = curr_node->key.str;
> +			uint32_t len = curr_node->key.len;
> +			*offset = !write_data ?
> +				  (*offset += mp_sizeof_str(len)) :
> +				  mp_encode_str(*offset, str, len);
> +		}
> +		/* Fill data. */
> +		uint32_t children_count = curr_node->child_count;
> +		if (curr->type == FIELD_TYPE_ARRAY) {
> +			*offset = !write_data ?
> +				  (*offset += mp_sizeof_array(children_count)) :
> +				  mp_encode_array(*offset, children_count);
> +		} else if (curr->type == FIELD_TYPE_MAP) {
> +			*offset = !write_data ?
> +				  (*offset += mp_sizeof_map(children_count)) :
> +				  mp_encode_map(*offset, children_count);
> +		} else {
> +			/* Leaf record. */
> +			mh_int_t k = mh_i64ptr_find(fields_iov_ht,
> +						    (uint64_t)curr, NULL);
> +			struct iovec *iov =
> +				k != mh_end(fields_iov_ht) ?
> +				mh_i64ptr_node(fields_iov_ht, k)->val : NULL;
> +			if (iov == NULL) {
> +				*offset = !write_data ?
> +					  (*offset += mp_sizeof_nil()) :
> +					  mp_encode_nil(*offset);
> +			} else {
> +				uint32_t data_offset = *offset - tuple_raw;
> +				int32_t slot = curr->offset_slot;
> +				if (write_data) {
> +					memcpy(*offset, iov->iov_base,
> +					       iov->iov_len);
> +					if (slot != TUPLE_OFFSET_SLOT_NIL)
> +						field_map[slot] = data_offset;
> +				}
> +				*offset += iov->iov_len;
> +			}
> +		}
> +		prev = curr;
> +	}
> +}
> +
>  static struct tuple *
>  vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
>  			       const struct key_def *cmp_def,
> @@ -378,51 +458,79 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
>  	/* UPSERT can't be surrogate. */
>  	assert(type != IPROTO_UPSERT);
>  	struct region *region = &fiber()->gc;
> +	struct tuple *stmt = NULL;
>
>  	uint32_t field_count = format->index_field_count;
> -	struct iovec *iov = region_alloc(region, sizeof(*iov) * field_count);
> +	uint32_t part_count = mp_decode_array(&key);
> +	assert(part_count == cmp_def->part_count);
> +	struct iovec *iov = region_alloc(region, sizeof(*iov) * part_count);
>  	if (iov == NULL) {
> -		diag_set(OutOfMemory, sizeof(*iov) * field_count,
> -			 "region", "iov for surrogate key");
> +		diag_set(OutOfMemory, sizeof(*iov) * part_count, "region",
> +			 "iov for surrogate key");
>  		return NULL;
>  	}
> -	memset(iov, 0, sizeof(*iov) * field_count);
> -	uint32_t part_count = mp_decode_array(&key);
> -	assert(part_count == cmp_def->part_count);
> -	assert(part_count <= field_count);
> -	uint32_t nulls_count = field_count - cmp_def->part_count;
> -	uint32_t bsize = mp_sizeof_array(field_count) +
> -			 mp_sizeof_nil() * nulls_count;
> -	for (uint32_t i = 0; i < part_count; ++i) {
> -		const struct key_part *part = &cmp_def->parts[i];
> +	/* Hastable linking leaf field and corresponding iov. */
> +	struct mh_i64ptr_t *fields_iov_ht = mh_i64ptr_new();

Allocating a new hash map for each statement allocation looks very
bad. I haven't looked into vy_stmt_tuple_restore_raw yet - I will a
bit later and try to figure out if we can actually live without it -
but anyway I don't think we could afford it. Please think of a better
way to implement this.

> +	if (fields_iov_ht == NULL) {
> +		diag_set(OutOfMemory, sizeof(struct mh_i64ptr_t),
> +			 "mh_i64ptr_new", "fields_iov_ht");
> +		return NULL;
> +	}
> +	if (mh_i64ptr_reserve(fields_iov_ht, part_count, NULL) != 0) {
> +		diag_set(OutOfMemory, part_count, "mh_i64ptr_reserve",
> +			 "fields_iov_ht");
> +		goto end;
> +	}
> +	memset(iov, 0, sizeof(*iov) * part_count);
> +	const struct key_part *part = cmp_def->parts;
> +	for (uint32_t i = 0; i < part_count; ++i, ++part) {
>  		assert(part->fieldno < field_count);
>  		const char *svp = key;
> -		iov[part->fieldno].iov_base = (char *) key;
> +		iov[i].iov_base = (char *) key;
>  		mp_next(&key);
> -		iov[part->fieldno].iov_len = key - svp;
> -		bsize += key - svp;
> +		iov[i].iov_len = key - svp;
> +		struct tuple_field *field;
> +		field = tuple_format_field(format, part->fieldno);
> +		assert(field != NULL);
> +		if (unlikely(part->path != NULL)) {
> +			field = tuple_format_field_by_path(format, field,
> +							   part->path,
> +							   part->path_len);
> +		}
> +		assert(field != NULL);
> +		struct mh_i64ptr_node_t node = {(uint64_t)field, &iov[i]};
> +		mh_int_t k = mh_i64ptr_put(fields_iov_ht, &node, NULL, NULL);
> +		if (unlikely(k == mh_end(fields_iov_ht))) {
> +			diag_set(OutOfMemory, part_count, "mh_i64ptr_put",
> +				 "fields_iov_ht");
> +			goto end;
> +		}
> +		k = mh_i64ptr_find(fields_iov_ht, (uint64_t)field, NULL);
> +		assert(k != mh_end(fields_iov_ht));
>  	}
> +	/* Calculate tuple size to make allocation. */
> +	char *data = NULL;
> +	vy_stmt_tuple_restore_raw(format, NULL, NULL, &data, fields_iov_ht,
> +				  false);
> +	uint32_t bsize = mp_sizeof_array(field_count) + data - (char *)NULL;
>
> -	struct tuple *stmt = vy_stmt_alloc(format, bsize);
> +	stmt = vy_stmt_alloc(format, bsize);
>  	if (stmt == NULL)
> -		return NULL;
> +		goto end;
>
> +	/* Construct tuple. */
>  	char *raw = (char *) tuple_data(stmt);
>  	uint32_t *field_map = (uint32_t *) raw;
> +	memset((char *)field_map - format->field_map_size, 0,
> +	       format->field_map_size);
>  	char *wpos = mp_encode_array(raw, field_count);
> -	for (uint32_t i = 0; i < field_count; ++i) {
> -		const struct tuple_field *field = tuple_format_field(format, i);
> -		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
> -			field_map[field->offset_slot] = wpos - raw;
> -		if (iov[i].iov_base == NULL) {
> -			wpos = mp_encode_nil(wpos);
> -		} else {
> -			memcpy(wpos, iov[i].iov_base, iov[i].iov_len);
> -			wpos += iov[i].iov_len;
> -		}
> -	}
> -	assert(wpos == raw + bsize);
> +	vy_stmt_tuple_restore_raw(format, raw, field_map, &wpos, fields_iov_ht,
> +				  true);
> +
> +	assert(wpos <= raw + bsize);
>  	vy_stmt_set_type(stmt, type);
> +end:
> +	mh_i64ptr_delete(fields_iov_ht);
>  	return stmt;
>  }
>
> diff --git a/test/engine/tuple.result b/test/engine/tuple.result
> index 35c700e..322821e 100644
> --- a/test/engine/tuple.result
> +++ b/test/engine/tuple.result
> @@ -954,6 +954,422 @@ type(tuple:tomap().fourth)
>  s:drop()
>  ---
>  ...
> +--
> +-- gh-1012: Indexes for JSON-defined paths.
> +--

Add a new file for this test please.

> +box.cfg()
> +---
> +...
> +s = box.schema.space.create('withdata', {engine = engine})
> +---
> +...
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO["fname"]'}, {3, 'str', path = '["FIO"].fname'}}})
> +---
> +- error: 'Can''t create or modify index ''test1'' in space ''withdata'': same key
> +    part is indexed twice'
> +...
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '["FIO"]["fname"]'}}})
> +---
> +- error: 'Wrong index options (field 2): ''path'' must be string'
> +...
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = 'FIO'}}})
> +---
> +- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
> +    ''map'' is not supported'
> +...
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[1]'}}})
> +---
> +- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
> +    ''array'' is not supported'
> +...
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO'}, {3, 'str', path = '["FIO"].fname'}}})
> +---
> +- error: Field [2]["FIO"].fname has type 'string' in one index, but type 'map' in
> +    another
> +...

If I change the path from '["FIO"].fname' to 'FIO.fname' in the index
definition, I will get the following error:

  error: Field [2]FIO.fname has type 'string' in one index, but type
    'map' in another

Note, [2]FIO.fname isn't a valid JSON path. Please fix.

> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[1].sname'}, {3, 'str', path = '["FIO"].fname'}}})
> +---
> +- error: Field [2]["FIO"].fname has type 'array' in one index, but type 'map' in another
> +...
> +s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO....fname'}}})
> +---
> +- error: 'Wrong index options (field 3): invalid JSON path ''FIO....fname'': path
> +    has invalid structure (error at position 5)'
> +...
> +idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'FIO.fname'}, {3, 'str', path = '["FIO"]["sname"]'}}})
> +---
> +...
> +assert(idx ~= nil)
> +---
> +- true
> +...
> +assert(idx.parts[2].path == "FIO.fname")
> +---
> +- true
> +...
> +s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
> +---
> +- error: 'Tuple field 3 type does not match one required by operation: expected map'
> +...

But field 3 does have type 'map'. The error message should clarify
that it's not field 3 as a whole that has the wrong type, but path FIO
in field 3.

> +s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
> +---
> +- error: 'Tuple field 3 type does not match one required by operation: expected string'
> +...
> +s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
> +---
> +- error: 'Tuple doesn''t math document structure: invalid field 3 document content:
> +    map doesn''t contain a key ''sname'' defined in index'
> +...

Again, it's not quite clear what's missing, [3].sname or
[3].FIO.sname. Please improve this kind of error message as well.
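For both messages I'd print the full path from the field number down
to the offending key, so the user doesn't have to guess. A sketch of a
helper that could do it (the name and the buffer handling are
hypothetical, and the recursion could as well be a loop):

	/*
	 * Render the path to @token, e.g. "[3].FIO.sname".
	 * NB: token numbering is 0-based; adding TUPLE_INDEX_BASE
	 * to the first-level token is omitted for brevity.
	 */
	static int
	json_token_path_snprint(char *buf, int size,
				const struct json_token *token,
				const struct json_token *root)
	{
		if (token == root)
			return 0;
		int len = json_token_path_snprint(buf, size,
						  token->parent, root);
		if (len < 0 || len >= size)
			return len;
		if (token->key.type == JSON_TOKEN_STR)
			return len + snprintf(buf + len, size - len, ".%.*s",
					      (int)token->key.len,
					      token->key.str);
		else
			return len + snprintf(buf + len, size - len, "[%d]",
					      token->key.num);
	}

With something like that, tuple_field_map_validate could report "map
doesn't contain a key '[3].FIO.sname' defined in index" instead of
just 'sname'.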