From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Kirill Shcherbatov
Subject: [PATCH v4 3/3] box: introduce multikey indexes in memtx
Date: Fri, 19 Apr 2019 17:14:25 +0300
Message-Id:
In-Reply-To:
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
To: tarantool-patches@freelists.org, vdavydov.dev@gmail.com
Cc: kostja@tarantool.org, Kirill Shcherbatov
List-ID:

- In the case of a multikey index an ambiguity arises: which key
  should be used in a comparison. The previously introduced
  comparison hints act as a non-negative numeric index of the key
  to use,
- Memtx B+ tree replace and build_next methods have been patched
  to insert the same tuple multiple times, once for each logical
  index of a key in the indexed array,
- Tuple field maps have been extended with service areas
  ("extents") that store the offsets of multikey index keys,
  addressed by an additional logical index.

Part of #1257

@TarantoolBot document
Title: introduce multikey indexes in memtx

Any JSON index in which at least one part contains the array
index placeholder "[*]" is called "multikey". Such indexes allow
you to automatically index a set of documents that share the same
document structure.

The multikey index design has a number of restrictions that must
be taken into account:
- it cannot be primary because of the ambiguity arising from its
  definition (a primary index requires a single unique key that
  identifies a tuple),
- if some node in the JSON tree of all defined indexes contains
  an array index placeholder [*], no other JSON path may use an
  explicit JSON index on its nested fields,
- it supports "unique" semantics, but its uniqueness is a little
  different from conventional indexes: you may insert a tuple in
  which the same key occurs multiple times into a unique multikey
  index, but you cannot insert a tuple when any of its keys is
  already present in some other tuple stored in the space (see
  the sketch after this list),
- a unique multikey index "duplicate" conflict occurs when the
  sets of extracted keys have a non-empty intersection,
- to distinguish the different keys by which a given tuple is
  indexed, each key is assigned a logical sequence number within
  the array defined with the array index placeholder [*] (such an
  array is called the multikey index root),
- no index part may contain more than one array index placeholder
  [*] in its JSON path,
- all parts containing JSON paths with an array index placeholder
  [*] must have the same (in terms of JSON tokens) prefix before
  the placeholder.
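A minimal sketch of the uniqueness semantics described above (the
space, index and field names here are illustrative only and are
not part of the patch or its tests):

u = box.schema.space.create('uniq_multikey')
_ = u:create_index('pk')
_ = u:create_index('sk', {unique = true, parts = {{2, 'str', path = 'tags[*]'}}})
-- The same key may occur several times within a single tuple:
u:insert({1, {tags = {'red', 'red', 'blue'}}}) -- ok
-- A key already indexed for another stored tuple is rejected:
u:insert({2, {tags = {'blue', 'green'}}}) -- error: duplicate key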
Example: s = box.schema.space.create('withdata') pk = s:create_index('pk') parts = { {2, 'str', path = 'data[*].name'}, {2, 'str', path = 'data[*].extra.phone'} } idx = s:create_index('idx', {parts = parts}) s:insert({1, {data = {{name="A", extra={phone="111"}}, {name="B", extra={phone="111"}}}, garbage = 1}} idx:get({'A', '111'}) --- src/box/errcode.h | 1 + src/box/field_map.c | 59 ++- src/box/field_map.h | 143 +++++- src/box/index_def.c | 5 + src/box/key_def.c | 132 ++++-- src/box/key_def.h | 34 ++ src/box/memtx_engine.c | 2 + src/box/memtx_space.c | 18 + src/box/memtx_tree.c | 272 +++++++++++- src/box/sql.c | 3 +- src/box/tuple.c | 27 +- src/box/tuple.h | 79 +++- src/box/tuple_compare.cc | 152 +++++-- src/box/tuple_compare.h | 10 + src/box/tuple_extract_key.cc | 5 +- src/box/tuple_format.c | 219 ++++++++-- src/box/tuple_format.h | 46 +- src/box/vinyl.c | 7 + src/box/vy_stmt.c | 6 +- test/box/misc.result | 1 + test/engine/engine.cfg | 3 + test/engine/json.result | 13 - test/engine/json.test.lua | 7 - test/engine/multikey.result | 789 ++++++++++++++++++++++++++++++++++ test/engine/multikey.test.lua | 206 +++++++++ 25 files changed, 2072 insertions(+), 167 deletions(-) create mode 100644 test/engine/multikey.result create mode 100644 test/engine/multikey.test.lua diff --git a/src/box/errcode.h b/src/box/errcode.h index 3f8cb8e0e..9c0a8e72c 100644 --- a/src/box/errcode.h +++ b/src/box/errcode.h @@ -246,6 +246,7 @@ struct errcode_record { /*191 */_(ER_SQL_PARSER_LIMIT, "%s %d exceeds the limit (%d)") \ /*192 */_(ER_INDEX_DEF_UNSUPPORTED, "%s are prohibited in an index definition") \ /*193 */_(ER_CK_DEF_UNSUPPORTED, "%s are prohibited in a CHECK constraint definition") \ + /*194 */_(ER_MULTIKEY_INDEX_MISMATCH, "Field %s is used as multikey in one index and as single key in another") \ /* * !IMPORTANT! Please follow instructions at start of the file diff --git a/src/box/field_map.c b/src/box/field_map.c index 690aa461d..5d25e3032 100644 --- a/src/box/field_map.c +++ b/src/box/field_map.c @@ -37,6 +37,7 @@ field_map_builder_create(struct field_map_builder *builder, uint32_t minimal_field_map_size, struct region *region) { + builder->extents_size = 0; builder->slot_count = minimal_field_map_size / sizeof(uint32_t); if (minimal_field_map_size == 0) { builder->slots = NULL; @@ -53,9 +54,63 @@ field_map_builder_create(struct field_map_builder *builder, return 0; } +struct field_map_builder_slot_extent * +field_map_builder_slot_extent_new(struct field_map_builder *builder, + int32_t offset_slot, uint32_t multikey_count, + struct region *region) +{ + struct field_map_builder_slot_extent *extent; + assert(!builder->slots[offset_slot].has_extent); + uint32_t sz = sizeof(struct field_map_builder_slot_extent) + + multikey_count * sizeof(uint32_t); + extent = region_alloc(region, sz); + if (extent == NULL) { + diag_set(OutOfMemory, sz, "region", "extent"); + return NULL; + } + memset(extent, 0, sz); + extent->size = multikey_count; + builder->slots[offset_slot].extent = extent; + builder->slots[offset_slot].has_extent = true; + builder->extents_size += sz; + return extent; +} + void field_map_build(struct field_map_builder *builder, char *buffer) { - uint32_t field_map_size = field_map_build_size(builder); - memcpy(buffer, (char *)builder->slots - field_map_size, field_map_size); + /* + * To initialize the field map and its extents, prepare + * the following memory layout with pointers: + * + * offset + * buffer +---------------------+ + * | | | + * [extentK] .. 
[extent1][[slotN]..[slot2][slot1]] + * | | | + * |extent_wptr | | |field_map + * -> -> <- + * + * The buffer size is assumed to be sufficient to write + * field_map_build_size(builder) bytes there. + */ + uint32_t *field_map = + (uint32_t *)(buffer + field_map_build_size(builder)); + char *extent_wptr = buffer; + for (int32_t i = -1; i >= -(int32_t)builder->slot_count; i--) { + if (!builder->slots[i].has_extent) { + field_map[i] = builder->slots[i].offset; + continue; + } + struct field_map_builder_slot_extent *extent = + builder->slots[i].extent; + /** Retrive memory for the extent. */ + uint32_t sz = sizeof(struct field_map_builder_slot_extent) + + extent->size * sizeof(uint32_t); + field_map[i] = + (uint32_t)((char *)extent_wptr - (char *)field_map); + memcpy(extent_wptr, extent, sz); + extent_wptr += sz; + } + assert(extent_wptr == buffer + builder->extents_size); } diff --git a/src/box/field_map.h b/src/box/field_map.h index f6c653030..11733baa4 100644 --- a/src/box/field_map.h +++ b/src/box/field_map.h @@ -32,8 +32,10 @@ */ #include #include +#include struct region; +struct field_map_builder_slot; /** * A field map is a special area is reserved before tuple's @@ -46,13 +48,15 @@ struct region; * offset_slot(s) is performed on tuple_format creation on index * create or alter (see tuple_format_create()). * - * 4b 4b MessagePack data. - * +------+----+------+---------------------------+ - * tuple: | offN | .. | off1 | header ..|key1|..|keyN|.. | - * +--+---+----+--+---+---------------------------+ - * | ... | ^ ^ - * | +-----------------+ | - * +--------------------------------------+ + * 4b 4b 4b 4b MessagePack data. + * +-----------+------+----+------+------------------------+ + *tuple: |cnt|off1|..| offN | .. | off1 | header ..|key1|..|keyN|| + * +-----+-----+--+---+----+--+---+------------------------+ + * ext1 ^ | | ... | ^ ^ + * +-----|--------+ | | | + * indirection | +-----------------+ | + * +----------------------------------------------+ + * (offset_slot = N, extent_slot = 1) --> offset * * This field_map_builder class is used for tuple field_map * construction. It encapsulates field_map build logic and size @@ -60,6 +64,14 @@ struct region; * * Each field offset is a positive number, except the case when * a field is not in the tuple. In this case offset is 0. + * + * In case of multikey index, the slot may refer to the + * "field_map_extent" sequence that contains an additional + * sequence of length defined before (the count of keys in the + * multikey index for given tuple). In such case offset slot + * represents int32_t negative value - the offset relative to + * the field_map pointer. The i-th extent's slot contains the + * positive offset of the i-th key field of the multikey index. */ struct field_map_builder { /** @@ -67,12 +79,60 @@ struct field_map_builder { * Its elements are accessible by negative indexes * that coinciding with offset_slot(s). */ - uint32_t *slots; + struct field_map_builder_slot *slots; /** * The count of slots in field_map_builder::slots * allocation. */ uint32_t slot_count; + /** + * Total size of memory is allocated for field_map + * extents. + */ + uint32_t extents_size; +}; + +/** + * Internal stucture representing field_map extent. + * (see field_map_builder description). + */ +struct field_map_builder_slot_extent { + /** + * Count of field_map_builder_slot_extent::offset[] + * elements. + */ + uint32_t size; + /** Data offset in tuple array. 
*/ + uint32_t offset[0]; +}; + +/** + * Instead of using uint32_t offset slots directly the + * field_map_builder uses this structure as a storage atom. + * When there is a need to initialize an extent, the + * field_map_builder allocates a new memory chunk and sets + * field_map_builder_slot::pointer (instead of real field_map + * reallocation). + * + * On field_map_build, all of the field_map_builder_slot_extent(s) + * are dumped to the same memory chunk that the regular field_map + * slots and corresponding slots are initialized with negative + * extent offset. + * + * The allocated memory is accounted for in extents_size. + */ +struct field_map_builder_slot { + /** + * True when this slot must be interpret as + * extent pointer. + */ + bool has_extent; + union { + /** Data offset in tuple. */ + uint32_t offset; + /** Pointer to field_map extent. */ + struct field_map_builder_slot_extent *extent; + }; }; /** @@ -82,9 +142,25 @@ struct field_map_builder { * When a field is not in the data tuple, its offset is 0. */ static inline uint32_t -field_map_get_offset(const uint32_t *field_map, int32_t offset_slot) +field_map_get_offset(const uint32_t *field_map, int32_t offset_slot, + int multikey_idx) { - return field_map[offset_slot]; + uint32_t offset; + if (multikey_idx >= 0 && field_map[offset_slot] > 0) { + assert((int32_t)field_map[offset_slot] < 0); + /** + * The field_map extent has the following + * structure: [size=N|slot1|slot2|..|slotN] + */ + uint32_t *extent = (uint32_t *)((char *)field_map + + (int32_t)field_map[offset_slot]); + if ((uint32_t)multikey_idx >= extent[0]) + return 0; + offset = extent[multikey_idx + 1]; + } else { + offset = field_map[offset_slot]; + } + return offset; } /** @@ -117,7 +193,49 @@ field_map_builder_set_slot(struct field_map_builder *builder, assert(offset_slot < 0); assert((uint32_t)-offset_slot <= builder->slot_count); assert(offset > 0); - builder->slots[offset_slot] = offset; + builder->slots[offset_slot].offset = offset; +} + +/** + * Internal function to allocate field map extent by offset_slot + * and count of multikey keys. + */ +struct field_map_builder_slot_extent * +field_map_builder_slot_extent_new(struct field_map_builder *builder, + int32_t offset_slot, uint32_t multikey_count, + struct region *region); + +/** + * Set data offset in field map extent (by given offset_slot, + * multikey_idx and multikey_count) for a field identified by + * unique offset_slot. + * + * The offset_slot argument must be negative and offset must be + * positive (by definition). 
+ */ +static inline int +field_map_builder_set_slot_multikey(struct field_map_builder *builder, + int32_t offset_slot, uint32_t offset, + int32_t multikey_idx, uint32_t multikey_count, + struct region *region) +{ + assert(offset_slot < 0); + assert(offset > 0); + assert(multikey_idx >= 0 && multikey_count > 0); + assert(multikey_idx < (int32_t)multikey_count); + struct field_map_builder_slot_extent *extent; + if (builder->slots[offset_slot].has_extent) { + extent = builder->slots[offset_slot].extent; + assert(extent != NULL); + assert(extent->size == multikey_count); + } else { + extent = field_map_builder_slot_extent_new(builder, offset_slot, + multikey_count, region); + if (extent == NULL) + return -1; + } + extent->offset[multikey_idx] = offset; + return 0; } /** @@ -126,7 +244,8 @@ field_map_builder_set_slot(struct field_map_builder *builder, static inline uint32_t field_map_build_size(struct field_map_builder *builder) { - return builder->slot_count * sizeof(uint32_t); + return builder->slot_count * sizeof(uint32_t) + + builder->extents_size; } /** diff --git a/src/box/index_def.c b/src/box/index_def.c index 476d9f8d3..28de89274 100644 --- a/src/box/index_def.c +++ b/src/box/index_def.c @@ -291,6 +291,11 @@ index_def_is_valid(struct index_def *index_def, const char *space_name) space_name, "too many key parts"); return false; } + if (index_def->iid == 0 && key_def_is_multikey(index_def->key_def)) { + diag_set(ClientError, ER_MODIFY_INDEX, index_def->name, + space_name, "primary key cannot be multikey"); + return false; + } for (uint32_t i = 0; i < index_def->key_def->part_count; i++) { assert(index_def->key_def->parts[i].type < field_type_MAX); if (index_def->key_def->parts[i].fieldno > BOX_INDEX_FIELD_MAX) { diff --git a/src/box/key_def.c b/src/box/key_def.c index d07bbe8bc..1366c46b8 100644 --- a/src/box/key_def.c +++ b/src/box/key_def.c @@ -104,6 +104,10 @@ key_def_dup(const struct key_def *src) size_t path_offset = src->parts[i].path - (char *)src; res->parts[i].path = (char *)res + path_offset; } + if (src->multikey_path != NULL) { + size_t path_offset = src->multikey_path - (char *)src; + res->multikey_path = (char *)res + path_offset; + } return res; } @@ -138,7 +142,69 @@ key_def_set_func(struct key_def *def) key_def_set_extract_func(def); } -static void +static int +key_def_set_part_path(struct key_def *def, uint32_t part_no, const char *path, + uint32_t path_len, char **path_pool) +{ + struct key_part *part = &def->parts[part_no]; + if (path == NULL) { + part->path = NULL; + part->path_len = 0; + return 0; + } + assert(path_pool != NULL); + part->path = *path_pool; + *path_pool += path_len; + memcpy(part->path, path, path_len); + part->path_len = path_len; + + /* + * Test whether this key_part has array index + * placeholder [*] (i.e. is a part of multikey index + * definition). + */ + int multikey_path_len = + json_path_multikey_offset(path, path_len, TUPLE_INDEX_BASE); + if ((uint32_t) multikey_path_len == path_len) + return 0; + + /* + * In case of multikey index key_parts must have the + * same JSON prefix. + */ + if (def->multikey_path == NULL) { + /* + * Keep the index of the first multikey key_part + * and the length of JSON path substring to the + * array index placeholder sign [*]. 
+ */ + def->multikey_path = part->path; + def->multikey_fieldno = part->fieldno; + def->multikey_path_len = (uint32_t) multikey_path_len; + } else if (json_path_cmp(path, multikey_path_len, def->multikey_path, + def->multikey_path_len, + TUPLE_INDEX_BASE) != 0) { + diag_set(ClientError, ER_WRONG_INDEX_OPTIONS, + part_no + TUPLE_INDEX_BASE, + "incompatable multikey index path"); + return -1; + } + int multikey_path_suffix_len = + path_len - multikey_path_len - strlen("[*]"); + if (json_path_multikey_offset(path + multikey_path_len + strlen("[*]"), + multikey_path_suffix_len, + TUPLE_INDEX_BASE) != + multikey_path_suffix_len) { + diag_set(ClientError, ER_WRONG_INDEX_OPTIONS, + part_no + TUPLE_INDEX_BASE, + "no more than one array index placeholder [*] is " + "allowed in JSON path"); + return -1; + } + return 0; +} + +static int key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno, enum field_type type, enum on_conflict_action nullable_action, struct coll *coll, uint32_t coll_id, @@ -158,17 +224,8 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno, def->parts[part_no].sort_order = sort_order; def->parts[part_no].offset_slot_cache = offset_slot; def->parts[part_no].format_epoch = format_epoch; - if (path != NULL) { - assert(path_pool != NULL); - def->parts[part_no].path = *path_pool; - *path_pool += path_len; - memcpy(def->parts[part_no].path, path, path_len); - def->parts[part_no].path_len = path_len; - } else { - def->parts[part_no].path = NULL; - def->parts[part_no].path_len = 0; - } column_mask_set_fieldno(&def->column_mask, fieldno); + return key_def_set_part_path(def, part_no, path, path_len, path_pool); } struct key_def * @@ -203,10 +260,14 @@ key_def_new(const struct key_part_def *parts, uint32_t part_count) coll = coll_id->coll; } uint32_t path_len = part->path != NULL ? 
strlen(part->path) : 0; - key_def_set_part(def, i, part->fieldno, part->type, - part->nullable_action, coll, part->coll_id, - part->sort_order, part->path, path_len, - &path_pool, TUPLE_OFFSET_SLOT_NIL, 0); + if (key_def_set_part(def, i, part->fieldno, part->type, + part->nullable_action, coll, part->coll_id, + part->sort_order, part->path, path_len, + &path_pool, TUPLE_OFFSET_SLOT_NIL, + 0) != 0) { + key_def_delete(def); + return NULL; + } } assert(path_pool == (char *)def + sz); key_def_set_func(def); @@ -256,11 +317,14 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count) key_def->unique_part_count = part_count; for (uint32_t item = 0; item < part_count; ++item) { - key_def_set_part(key_def, item, fields[item], - (enum field_type)types[item], - ON_CONFLICT_ACTION_DEFAULT, - NULL, COLL_NONE, SORT_ORDER_ASC, NULL, 0, - NULL, TUPLE_OFFSET_SLOT_NIL, 0); + if (key_def_set_part(key_def, item, fields[item], + (enum field_type)types[item], + ON_CONFLICT_ACTION_DEFAULT, NULL, + COLL_NONE, SORT_ORDER_ASC, NULL, 0, NULL, + TUPLE_OFFSET_SLOT_NIL, 0) != 0) { + key_def_delete(key_def); + return NULL; + } } key_def_set_func(key_def); return key_def; @@ -685,11 +749,15 @@ key_def_merge(const struct key_def *first, const struct key_def *second) part = first->parts; end = part + first->part_count; for (; part != end; part++) { - key_def_set_part(new_def, pos++, part->fieldno, part->type, - part->nullable_action, part->coll, - part->coll_id, part->sort_order, part->path, - part->path_len, &path_pool, - part->offset_slot_cache, part->format_epoch); + if (key_def_set_part(new_def, pos++, part->fieldno, part->type, + part->nullable_action, part->coll, + part->coll_id, part->sort_order, + part->path, part->path_len, &path_pool, + part->offset_slot_cache, + part->format_epoch) != 0) { + key_def_delete(new_def); + return NULL; + } } /* Set-append second key def's part to the new key def. */ @@ -698,11 +766,15 @@ key_def_merge(const struct key_def *first, const struct key_def *second) for (; part != end; part++) { if (!key_def_can_merge(first, part)) continue; - key_def_set_part(new_def, pos++, part->fieldno, part->type, - part->nullable_action, part->coll, - part->coll_id, part->sort_order, part->path, - part->path_len, &path_pool, - part->offset_slot_cache, part->format_epoch); + if (key_def_set_part(new_def, pos++, part->fieldno, part->type, + part->nullable_action, part->coll, + part->coll_id, part->sort_order, + part->path, part->path_len, &path_pool, + part->offset_slot_cache, + part->format_epoch) != 0) { + key_def_delete(new_def); + return NULL; + } } assert(path_pool == (char *)new_def + sz); key_def_set_func(new_def); diff --git a/src/box/key_def.h b/src/box/key_def.h index 6205c278a..fe4fd095c 100644 --- a/src/box/key_def.h +++ b/src/box/key_def.h @@ -214,6 +214,34 @@ struct key_def { bool has_optional_parts; /** Key fields mask. @sa column_mask.h for details. */ uint64_t column_mask; + /** + * In case of the multikey index, a pointer to the + * JSON path string, the path to the root node of + * multikey index that contains the array having + * index placeholder sign [*]. + * + * This pointer duplicates the JSON path of some key_part. + * This path is not 0-terminated. Moreover, it is only + * JSON path subpath so key_def::multikey_path_len must + * be directly used in all cases. + * + * This field is not NULL iff this is multikey index + * key definition. + */ + const char *multikey_path; + /** + * The length of the key_def::multikey_path. 
+ * Valid when key_def_is_multikey(key_def) is true, + * undefined otherwise. + */ + uint32_t multikey_path_len; + /** + * The index of the root field of the multikey JSON + * path index key_def::multikey_path. + * Valid when key_def_is_multikey(key_def) is true, + * undefined otherwise. + */ + uint32_t multikey_fieldno; /** The size of the 'parts' array. */ uint32_t part_count; /** Description of parts of a multipart index. */ @@ -452,6 +480,12 @@ key_def_is_sequential(const struct key_def *key_def) return true; } +static inline bool +key_def_is_multikey(const struct key_def *key_def) +{ + return key_def->multikey_path != NULL; +} + /** * Return true if @a key_def defines has fields that requires * special collation comparison. diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c index 210f3c22c..65e8804a4 100644 --- a/src/box/memtx_engine.c +++ b/src/box/memtx_engine.c @@ -1330,5 +1330,7 @@ memtx_index_def_change_requires_rebuild(struct index *index, TUPLE_INDEX_BASE) != 0) return true; } + assert(key_def_is_multikey(old_cmp_def) == + key_def_is_multikey(new_cmp_def)); return false; } diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c index a28204d30..c3c9ac79b 100644 --- a/src/box/memtx_space.c +++ b/src/box/memtx_space.c @@ -637,6 +637,12 @@ memtx_space_check_index_def(struct space *space, struct index_def *index_def) "HASH index must be unique"); return -1; } + if (key_def_is_multikey(index_def->key_def)) { + diag_set(ClientError, ER_MODIFY_INDEX, + index_def->name, space_name(space), + "HASH index cannot be multikey"); + return -1; + } break; case TREE: /* TREE index has no limitations. */ @@ -660,6 +666,12 @@ memtx_space_check_index_def(struct space *space, struct index_def *index_def) "RTREE index field type must be ARRAY"); return -1; } + if (key_def_is_multikey(index_def->key_def)) { + diag_set(ClientError, ER_MODIFY_INDEX, + index_def->name, space_name(space), + "RTREE index cannot be multikey"); + return -1; + } /* no furter checks of parts needed */ return 0; case BITSET: @@ -682,6 +694,12 @@ memtx_space_check_index_def(struct space *space, struct index_def *index_def) "BITSET index field type must be NUM or STR"); return -1; } + if (key_def_is_multikey(index_def->key_def)) { + diag_set(ClientError, ER_MODIFY_INDEX, + index_def->name, space_name(space), + "BITSET index cannot be multikey"); + return -1; + } /* no furter checks of parts needed */ return 0; default: diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c index fe037c54a..9a0834994 100644 --- a/src/box/memtx_tree.c +++ b/src/box/memtx_tree.c @@ -584,6 +584,146 @@ memtx_tree_index_replace(struct index *base, struct tuple *old_tuple, return 0; } +/** + * Perform tuple insertion by given multikey index. + * In case of replacement, all old tuple entries are deleted + * by all it's multikey indexes. + */ +static int +memtx_tree_index_insert_multikey(struct memtx_tree_index *index, + struct tuple *old_tuple, struct tuple *new_tuple, + enum dup_replace_mode mode, int multikey_idx, + struct tuple **replaced_tuple) +{ + struct memtx_tree_data new_data, dup_data; + new_data.tuple = new_tuple; + new_data.hint = multikey_idx; + dup_data.tuple = NULL; + if (memtx_tree_insert(&index->tree, new_data, &dup_data) != 0) { + diag_set(OutOfMemory, MEMTX_EXTENT_SIZE, "memtx_tree_index", + "replace"); + return -1; + } + int errcode = 0; + if (dup_data.tuple == new_tuple) { + /* + * When tuple contains the same key multiple + * times, the previous key occurrence is pushed + * out of the index. 
+ */ + dup_data.tuple = NULL; + } else if ((errcode = replace_check_dup(old_tuple, dup_data.tuple, + mode)) != 0) { + /* Rollback replace. */ + memtx_tree_delete(&index->tree, new_data); + if (dup_data.tuple != NULL) + memtx_tree_insert(&index->tree, dup_data, NULL); + struct space *sp = space_cache_find(index->base.def->space_id); + if (sp != NULL) { + diag_set(ClientError, errcode, index->base.def->name, + space_name(sp)); + } + return -1; + } + *replaced_tuple = dup_data.tuple; + if (dup_data.tuple == NULL) + return 0; + + /* + * Propagate dup_data.tuple deletion for all multikey + * index keys extracted by dup_data.tuple. + */ + int conflict_idx = (int)dup_data.hint; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); + uint32_t multikey_count = tuple_multikey_count(dup_data.tuple, cmp_def); + for (int i = 0; (uint32_t) i < multikey_count; i++) { + if (conflict_idx == i) + continue; + dup_data.hint = i; + memtx_tree_delete(&index->tree, dup_data); + } + /* + * The new_tuple multikey index key (by conflict_idx) may + * be deleted from index when old_tuple has another key by + * some other multikey index != conflict_idx. Restore it + * as a precaution. + */ + memtx_tree_insert(&index->tree, new_data, NULL); + return 0; +} + +static int +memtx_tree_index_replace_multikey(struct index *base, struct tuple *old_tuple, + struct tuple *new_tuple, enum dup_replace_mode mode, + struct tuple **result) +{ + struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); + *result = NULL; + if (new_tuple != NULL) { + int multikey_idx = 0, err; + uint32_t multikey_count = + tuple_multikey_count(new_tuple, cmp_def); + for (; (uint32_t) multikey_idx < multikey_count; + multikey_idx++) { + struct tuple *replaced_tuple; + err = memtx_tree_index_insert_multikey(index, old_tuple, + new_tuple, mode, multikey_idx, + &replaced_tuple); + if (err != 0) + break; + if (replaced_tuple != NULL) { + assert(*result == NULL || + *result == replaced_tuple); + *result = replaced_tuple; + } + } + if (err != 0) { + /* + * In case of error, rollback new_tuple + * insertion by multikey index [0, multikey_idx). + */ + struct memtx_tree_data data; + data.tuple = new_tuple; + for (int i = 0; i < multikey_idx; i++) { + data.hint = i; + memtx_tree_delete(&index->tree, data); + } + if (*result != NULL) { + /* + * Restore replaced tuple index + * occurrences. + */ + data.tuple = *result; + multikey_count = + tuple_multikey_count(*result, cmp_def); + for (int i = 0; + (uint32_t) i < multikey_count; i++) { + data.hint = i; + memtx_tree_insert(&index->tree, data, + NULL); + } + } + return -1; + } + if (*result != NULL) + return 0; + } + if (old_tuple != NULL) { + *result = old_tuple; + struct memtx_tree_data data; + data.tuple = old_tuple; + uint32_t multikey_count = + tuple_multikey_count(old_tuple, cmp_def); + for (int multikey_idx = 0; + (uint32_t) multikey_idx < multikey_count; multikey_idx++) { + data.hint = multikey_idx; + memtx_tree_delete(&index->tree, data); + } + } + return 0; +} + static struct iterator * memtx_tree_index_create_iterator(struct index *base, enum iterator_type type, const char *key, uint32_t part_count) @@ -655,31 +795,32 @@ memtx_tree_index_reserve(struct index *base, uint32_t size_hint) return 0; } +/* Initialize the next element of the index build_array. 
*/ static int -memtx_tree_index_build_next(struct index *base, struct tuple *tuple) +memtx_tree_index_build_array_append(struct memtx_tree_index *index, + struct tuple *tuple, hint_t hint) { - struct memtx_tree_index *index = (struct memtx_tree_index *)base; - struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); + bool need_realloc = false; if (index->build_array == NULL) { - index->build_array = malloc(MEMTX_EXTENT_SIZE); - if (index->build_array == NULL) { - diag_set(OutOfMemory, MEMTX_EXTENT_SIZE, - "memtx_tree_index", "build_next"); - return -1; - } index->build_array_alloc_size = MEMTX_EXTENT_SIZE / sizeof(index->build_array[0]); + need_realloc = true; + } + if (index->build_array_size >= index->build_array_alloc_size) { + index->build_array_alloc_size += + MAX(index->build_array_alloc_size / 2, + index->build_array_size + 1 - + index->build_array_alloc_size); + need_realloc = true; } - assert(index->build_array_size <= index->build_array_alloc_size); - if (index->build_array_size == index->build_array_alloc_size) { - index->build_array_alloc_size = index->build_array_alloc_size + - index->build_array_alloc_size / 2; + if (need_realloc) { struct memtx_tree_data *tmp = realloc(index->build_array, index->build_array_alloc_size * sizeof(*tmp)); if (tmp == NULL) { diag_set(OutOfMemory, index->build_array_alloc_size * - sizeof(*tmp), "memtx_tree_index", "build_next"); + sizeof(*tmp), "memtx_tree_index", + "build_array"); return -1; } index->build_array = tmp; @@ -687,10 +828,66 @@ memtx_tree_index_build_next(struct index *base, struct tuple *tuple) struct memtx_tree_data *elem = &index->build_array[index->build_array_size++]; elem->tuple = tuple; - elem->hint = tuple_hint(tuple, cmp_def); + elem->hint = hint; + return 0; +} + +static int +memtx_tree_index_build_next(struct index *base, struct tuple *tuple) +{ + struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); + return memtx_tree_index_build_array_append(index, tuple, + tuple_hint(tuple, cmp_def)); +} + +static int +memtx_tree_index_build_next_multikey(struct index *base, struct tuple *tuple) +{ + struct memtx_tree_index *index = (struct memtx_tree_index *)base; + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); + uint32_t multikey_count = tuple_multikey_count(tuple, cmp_def); + for (uint32_t multikey_idx = 0; multikey_idx < multikey_count; + multikey_idx++) { + if (memtx_tree_index_build_array_append(index, tuple, + multikey_idx) != 0) + return -1; + } return 0; } +/** + * Process build_array of specified index and remove duplicates + * of equal tuples (in terms of index's cmp_def and have same + * tuple pointer). The build_array is expected to be sorted. 
+ */ +static void +memtx_tree_index_build_array_deduplicate(struct memtx_tree_index *index) +{ + struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); + size_t i = 0; + while (i < index->build_array_size) { + size_t next_key = ++i; + while (next_key < index->build_array_size && + index->build_array[next_key].tuple == + index->build_array[next_key - 1].tuple && + tuple_compare_hinted(index->build_array[next_key].tuple, + index->build_array[next_key].hint, + index->build_array[next_key - 1].tuple, + index->build_array[next_key - 1].hint, + cmp_def) == 0) + next_key++; + size_t equal_keys = next_key - i; + if (equal_keys == 0) + continue; + memmove(&index->build_array[i - 1], + &index->build_array[next_key - 1], + (index->build_array_size - next_key + 1) * + sizeof(index->build_array[0])); + index->build_array_size -= equal_keys; + } +} + static void memtx_tree_index_end_build(struct index *base) { @@ -698,6 +895,16 @@ memtx_tree_index_end_build(struct index *base) struct key_def *cmp_def = memtx_tree_cmp_def(&index->tree); qsort_arg(index->build_array, index->build_array_size, sizeof(index->build_array[0]), memtx_tree_qcompare, cmp_def); + if (key_def_is_multikey(cmp_def)) { + /* + * Multikey index may have equal(in terms of + * cmp_def) keys inserted by different multikey + * offsets. We must deduplicate them because + * the following memtx_tree_build assumes that + * all keys are unique. + */ + memtx_tree_index_build_array_deduplicate(index); + } memtx_tree_build(&index->tree, index->build_array, index->build_array_size); @@ -793,6 +1000,36 @@ static const struct index_vtab memtx_tree_index_vtab = { /* .end_build = */ memtx_tree_index_end_build, }; +static const struct index_vtab memtx_tree_index_multikey_vtab = { + /* .destroy = */ memtx_tree_index_destroy, + /* .commit_create = */ generic_index_commit_create, + /* .abort_create = */ generic_index_abort_create, + /* .commit_modify = */ generic_index_commit_modify, + /* .commit_drop = */ generic_index_commit_drop, + /* .update_def = */ memtx_tree_index_update_def, + /* .depends_on_pk = */ memtx_tree_index_depends_on_pk, + /* .def_change_requires_rebuild = */ + memtx_index_def_change_requires_rebuild, + /* .size = */ memtx_tree_index_size, + /* .bsize = */ memtx_tree_index_bsize, + /* .min = */ generic_index_min, + /* .max = */ generic_index_max, + /* .random = */ memtx_tree_index_random, + /* .count = */ memtx_tree_index_count, + /* .get = */ memtx_tree_index_get, + /* .replace = */ memtx_tree_index_replace_multikey, + /* .create_iterator = */ memtx_tree_index_create_iterator, + /* .create_snapshot_iterator = */ + memtx_tree_index_create_snapshot_iterator, + /* .stat = */ generic_index_stat, + /* .compact = */ generic_index_compact, + /* .reset_stat = */ generic_index_reset_stat, + /* .begin_build = */ memtx_tree_index_begin_build, + /* .reserve = */ memtx_tree_index_reserve, + /* .build_next = */ memtx_tree_index_build_next_multikey, + /* .end_build = */ memtx_tree_index_end_build, +}; + struct index * memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def) { @@ -803,8 +1040,11 @@ memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def) "malloc", "struct memtx_tree_index"); return NULL; } + const struct index_vtab *vtab = key_def_is_multikey(def->key_def) ? 
+ &memtx_tree_index_multikey_vtab : + &memtx_tree_index_vtab; if (index_create(&index->base, (struct engine *)memtx, - &memtx_tree_index_vtab, def) != 0) { + vtab, def) != 0) { free(index); return NULL; } diff --git a/src/box/sql.c b/src/box/sql.c index 2310ee5e3..7a20d86e0 100644 --- a/src/box/sql.c +++ b/src/box/sql.c @@ -782,7 +782,8 @@ tarantoolsqlIdxKeyCompare(struct BtCursor *cursor, } else { uint32_t field_offset = field_map_get_offset(field_map, - field->offset_slot); + field->offset_slot, + -1); p = base + field_offset; } } diff --git a/src/box/tuple.c b/src/box/tuple.c index c325f58c9..35194001f 100644 --- a/src/box/tuple.c +++ b/src/box/tuple.c @@ -453,7 +453,8 @@ tuple_field_go_to_key(const char **field, const char *key, int len) } int -tuple_go_to_path(const char **data, const char *path, uint32_t path_len) +tuple_go_to_path(const char **data, const char *path, uint32_t path_len, + int multikey_idx) { int rc; struct json_lexer lexer; @@ -461,6 +462,11 @@ tuple_go_to_path(const char **data, const char *path, uint32_t path_len) json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE); while ((rc = json_lexer_next_token(&lexer, &token)) == 0) { switch (token.type) { + case JSON_TOKEN_ANY: + if (multikey_idx < 0) + return -1; + token.num = multikey_idx; + FALLTHROUGH; case JSON_TOKEN_NUM: rc = tuple_field_go_to_index(data, token.num); break; @@ -530,7 +536,24 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple, } return tuple_field_raw_by_path(format, tuple, field_map, fieldno, path + lexer.offset, - path_len - lexer.offset, NULL); + path_len - lexer.offset, NULL, -1); +} + +uint32_t +tuple_raw_multikey_count(struct tuple_format *format, const char *data, + const uint32_t *field_map, + struct key_def *key_def) +{ + assert(key_def_is_multikey(key_def)); + const char *array_raw = + tuple_field_raw_by_path(format, data, field_map, + key_def->multikey_fieldno, + key_def->multikey_path, + key_def->multikey_path_len, NULL, -1); + if (array_raw == NULL) + return 0; + assert(mp_typeof(*array_raw) == MP_ARRAY); + return mp_decode_array(&array_raw); } /* }}} tuple_field_* getters */ diff --git a/src/box/tuple.h b/src/box/tuple.h index d261d4cc9..c0d06dbe5 100644 --- a/src/box/tuple.h +++ b/src/box/tuple.h @@ -506,14 +506,19 @@ tuple_field_count(struct tuple *tuple) * with NULL. * @param path The path to process. * @param path_len The length of the @path. + * @param multikey_idx The multikey index hint - index of + * multikey index key to retrieve when array + * index placeholder "[*]" is met. * @retval 0 On success. * @retval -1 In case of error in JSON path. */ int -tuple_go_to_path(const char **data, const char *path, uint32_t path_len); +tuple_go_to_path(const char **data, const char *path, uint32_t path_len, + int multikey_idx); /** - * Get tuple field by field index and relative JSON path. + * Get tuple field by field index, relative JSON path and + * multikey_idx. * @param format Tuple format. * @param tuple MessagePack tuple's body. * @param field_map Tuple field map. @@ -526,12 +531,15 @@ tuple_go_to_path(const char **data, const char *path, uint32_t path_len); * access data in a single operation. * Else it is initialized with offset_slot * of format field by path. + * @param multikey_idx The multikey index hint - index of + * multikey item item to retrieve when array + * index placeholder "[*]" is met. 
*/ static inline const char * tuple_field_raw_by_path(struct tuple_format *format, const char *tuple, const uint32_t *field_map, uint32_t fieldno, const char *path, uint32_t path_len, - int32_t *offset_slot_hint) + int32_t *offset_slot_hint, int multikey_idx) { int32_t offset_slot; if (offset_slot_hint != NULL && @@ -558,7 +566,8 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple, *offset_slot_hint = offset_slot; offset_slot_access: /* Indexed field */ - offset = field_map_get_offset(field_map, offset_slot); + offset = field_map_get_offset(field_map, offset_slot, + multikey_idx); if (offset == 0) return NULL; tuple += offset; @@ -572,8 +581,8 @@ parse: for (uint32_t k = 0; k < fieldno; k++) mp_next(&tuple); if (path != NULL && - unlikely(tuple_go_to_path(&tuple, path, - path_len) != 0)) + unlikely(tuple_go_to_path(&tuple, path, path_len, + multikey_idx) != 0)) return NULL; } return tuple; @@ -595,7 +604,7 @@ tuple_field_raw(struct tuple_format *format, const char *tuple, const uint32_t *field_map, uint32_t field_no) { return tuple_field_raw_by_path(format, tuple, field_map, field_no, - NULL, 0, NULL); + NULL, 0, NULL, -1); } /** @@ -634,16 +643,19 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple, uint32_t path_len, uint32_t path_hash); /** - * Get a tuple field pointed to by an index part. + * Get a tuple field pointed to by an index part and multikey + * index hint. * @param format Tuple format. - * @param tuple A pointer to MessagePack array. + * @param data A pointer to MessagePack array. * @param field_map A pointer to the LAST element of field map. * @param part Index part to use. + * @param multikey_idx A multikey index hint. * @retval Field data if the field exists or NULL. */ static inline const char * -tuple_field_raw_by_part(struct tuple_format *format, const char *data, - const uint32_t *field_map, struct key_part *part) +tuple_field_raw_by_part_multikey(struct tuple_format *format, const char *data, + const uint32_t *field_map, + struct key_part *part, int multikey_idx) { if (unlikely(part->format_epoch != format->epoch)) { assert(format->epoch != 0); @@ -656,7 +668,23 @@ tuple_field_raw_by_part(struct tuple_format *format, const char *data, } return tuple_field_raw_by_path(format, data, field_map, part->fieldno, part->path, part->path_len, - &part->offset_slot_cache); + &part->offset_slot_cache, multikey_idx); +} + +/** + * Get a tuple field pointed to by an index part. + * @param format Tuple format. + * @param data A pointer to MessagePack array. + * @param field_map A pointer to the LAST element of field map. + * @param part Index part to use. + * @retval Field data if the field exists or NULL. + */ +static inline const char * +tuple_field_raw_by_part(struct tuple_format *format, const char *data, + const uint32_t *field_map, struct key_part *part) +{ + return tuple_field_raw_by_part_multikey(format, data, field_map, + part, -1); } /** @@ -672,6 +700,33 @@ tuple_field_by_part(struct tuple *tuple, struct key_part *part) tuple_field_map(tuple), part); } +/** + * Get count of multikey index keys in tuple by given multikey + * index definition. + * @param format Tuple format. + * @param data A pointer to MessagePack array. + * @param field_map A pointer to the LAST element of field map. + * @param key_def Index key_definition. + * @retval Count of multikey index keys in the given tuple. 
+ */ +uint32_t +tuple_raw_multikey_count(struct tuple_format *format, const char *data, + const uint32_t *field_map, struct key_def *key_def); + +/** + * Get count of multikey index keys in tuple by given multikey + * index definition. + * @param tuple Tuple to get the count of multikey keys from. + * @param key_def Index key_definition. + * @retval Count of multikey index keys in the given tuple. + */ +static inline uint32_t +tuple_multikey_count(struct tuple *tuple, struct key_def *key_def) +{ + return tuple_raw_multikey_count(tuple_format(tuple), tuple_data(tuple), + tuple_field_map(tuple), key_def); +} + /** * @brief Tuple Interator */ diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc index a3cabf19f..af9aa7b6c 100644 --- a/src/box/tuple_compare.cc +++ b/src/box/tuple_compare.cc @@ -375,6 +375,29 @@ mp_compare_scalar_coll(const char *field_a, const char *field_b, return mp_compare_scalar_with_type(field_a, type_a, field_b, type_b); } +static inline int +tuple_compare_multikey(struct tuple *tuple_a, struct tuple *tuple_b, + struct key_def *key_def) +{ + (void)tuple_a; + (void)tuple_b; + (void)key_def; + unreachable(); + return 0; +} + +static inline int +tuple_compare_with_key_multikey(struct tuple *tuple, const char *key, + uint32_t part_count, struct key_def *key_def) +{ + (void) tuple; + (void) key; + (void) part_count; + (void) key_def; + unreachable(); + return 0; +} + /** * @brief Compare two fields parts using a type definition * @param field_a field @@ -445,7 +468,8 @@ tuple_compare_field_with_type(const char *field_a, enum mp_type a_type, } } -template +template static inline int tuple_compare_slowpath_hinted(struct tuple *tuple_a, hint_t tuple_a_hint, struct tuple *tuple_b, hint_t tuple_b_hint, @@ -455,8 +479,11 @@ tuple_compare_slowpath_hinted(struct tuple *tuple_a, hint_t tuple_a_hint, assert(!has_optional_parts || is_nullable); assert(is_nullable == key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); - int rc = hint_cmp(tuple_a_hint, tuple_b_hint); - if (rc != 0) + assert(key_def_is_multikey(key_def) == is_multikey); + assert(!is_multikey || (is_multikey == has_json_paths && + tuple_a_hint != HINT_NONE && tuple_b_hint != HINT_NONE)); + int rc = 0; + if (!is_multikey && (rc = hint_cmp(tuple_a_hint, tuple_b_hint)) != 0) return rc; struct key_part *part = key_def->parts; const char *tuple_a_raw = tuple_data(tuple_a); @@ -499,7 +526,14 @@ tuple_compare_slowpath_hinted(struct tuple *tuple_a, hint_t tuple_a_hint, end = part + key_def->part_count; for (; part < end; part++) { - if (has_json_paths) { + if (is_multikey) { + field_a = tuple_field_raw_by_part_multikey(format_a, + tuple_a_raw, field_map_a, part, + (int)tuple_a_hint); + field_b = tuple_field_raw_by_part_multikey(format_b, + tuple_b_raw, field_map_b, part, + (int)tuple_b_hint); + } else if (has_json_paths) { field_a = tuple_field_raw_by_part(format_a, tuple_a_raw, field_map_a, part); field_b = tuple_field_raw_by_part(format_b, tuple_b_raw, @@ -556,7 +590,14 @@ tuple_compare_slowpath_hinted(struct tuple *tuple_a, hint_t tuple_a_hint, */ end = key_def->parts + key_def->part_count; for (; part < end; ++part) { - if (has_json_paths) { + if (is_multikey) { + field_a = tuple_field_raw_by_part_multikey(format_a, + tuple_a_raw, field_map_a, part, + (int)tuple_a_hint); + field_b = tuple_field_raw_by_part_multikey(format_b, + tuple_b_raw, field_map_b, part, + (int)tuple_b_hint); + } else if (has_json_paths) { field_a = tuple_field_raw_by_part(format_a, tuple_a_raw, field_map_a, part); 
field_b = tuple_field_raw_by_part(format_b, tuple_b_raw, @@ -586,11 +627,12 @@ tuple_compare_slowpath(struct tuple *tuple_a, struct tuple *tuple_b, struct key_def *key_def) { return tuple_compare_slowpath_hinted - + (tuple_a, HINT_NONE, tuple_b, HINT_NONE, key_def); } -template +template static inline int tuple_compare_with_key_slowpath_hinted(struct tuple *tuple, hint_t tuple_hint, const char *key, uint32_t part_count, @@ -602,8 +644,11 @@ tuple_compare_with_key_slowpath_hinted(struct tuple *tuple, assert(has_optional_parts == key_def->has_optional_parts); assert(key != NULL || part_count == 0); assert(part_count <= key_def->part_count); - int rc = hint_cmp(tuple_hint, key_hint); - if (rc != 0) + assert(key_def_is_multikey(key_def) == is_multikey); + assert(!is_multikey || (is_multikey == has_json_paths && + tuple_hint != HINT_NONE && key_hint == HINT_NONE)); + int rc = 0; + if (!is_multikey && (rc = hint_cmp(tuple_hint, key_hint)) != 0) return rc; struct key_part *part = key_def->parts; struct tuple_format *format = tuple_format(tuple); @@ -612,7 +657,11 @@ tuple_compare_with_key_slowpath_hinted(struct tuple *tuple, enum mp_type a_type, b_type; if (likely(part_count == 1)) { const char *field; - if (has_json_paths) { + if (is_multikey) { + field = tuple_field_raw_by_part_multikey(format, + tuple_raw, field_map, part, + (int)tuple_hint); + } else if (has_json_paths) { field = tuple_field_raw_by_part(format, tuple_raw, field_map, part); } else { @@ -642,7 +691,11 @@ tuple_compare_with_key_slowpath_hinted(struct tuple *tuple, struct key_part *end = part + part_count; for (; part < end; ++part, mp_next(&key)) { const char *field; - if (has_json_paths) { + if (is_multikey) { + field = tuple_field_raw_by_part_multikey(format, + tuple_raw, field_map, part, + (int)tuple_hint); + } else if (has_json_paths) { field = tuple_field_raw_by_part(format, tuple_raw, field_map, part); } else { @@ -684,7 +737,7 @@ tuple_compare_with_key_slowpath(struct tuple *tuple, const char *key, uint32_t part_count, struct key_def *key_def) { return tuple_compare_with_key_slowpath_hinted - + (tuple, HINT_NONE, key, part_count, HINT_NONE, key_def); } @@ -1367,6 +1420,9 @@ static const comparator_with_key_signature cmp_wk_arr[] = { * * - For a field containing NULL, the value is 0, and we rely on * mp_class comparison rules for arranging nullable fields. + * + * Note: comparison hint only makes sense for non-multikey + * indexes. */ #define HINT_BITS (sizeof(hint_t) * CHAR_BIT) #define HINT_CLASS_BITS 4 @@ -1615,6 +1671,35 @@ tuple_hint(struct tuple *tuple, struct key_def *key_def) return field_hint(field, key_def->parts->coll); } +static hint_t +key_hint_multikey(const char *key, uint32_t part_count, struct key_def *key_def) +{ + (void) key; + (void) part_count; + (void) key_def; + /* + * Multikey hint for tuple is an index of the key in + * array, it always must be defined. While + * tuple_hint_multikey, tuple_compare_multikey and + * tuple_compare_with_key_multikey assume that it must + * be initialized manually(so they mustn't be called), + * the virtual method for a key makes sense. Overriding + * this method such way, we extend existend code to + * do nothing on key hint calculation an it is valid + * because it is never used(unlike tuple hint). 
+ */ + return HINT_NONE; +} + +static hint_t +tuple_hint_multikey(struct tuple *tuple, struct key_def *key_def) +{ + (void) tuple; + (void) key_def; + unreachable(); + return HINT_NONE; +} + template static void key_def_set_hint_func(struct key_def *def) @@ -1636,6 +1721,11 @@ key_def_set_hint_func(struct key_def *def) static void key_def_set_hint_func(struct key_def *def) { + if (key_def_is_multikey(def)) { + def->key_hint = key_hint_multikey; + def->tuple_hint = tuple_hint_multikey; + return; + } switch (def->parts->type) { case FIELD_TYPE_BOOLEAN: key_def_set_hint_func(def); @@ -1714,7 +1804,7 @@ key_def_set_compare_func_fast(struct key_def *def) tuple_compare_slowpath; cmp_hinted = is_sequential ? tuple_compare_sequential_hinted : - tuple_compare_slowpath_hinted; + tuple_compare_slowpath_hinted; } if (cmp_wk == NULL) { cmp_wk = is_sequential ? @@ -1722,7 +1812,8 @@ key_def_set_compare_func_fast(struct key_def *def) tuple_compare_with_key_slowpath; cmp_wk_hinted = is_sequential ? tuple_compare_with_key_sequential_hinted : - tuple_compare_with_key_slowpath_hinted; + tuple_compare_with_key_slowpath_hinted; } def->tuple_compare = cmp; @@ -1750,12 +1841,13 @@ key_def_set_compare_func_plain(struct key_def *def) def->tuple_compare = tuple_compare_slowpath ; def->tuple_compare_hinted = tuple_compare_slowpath_hinted - ; + ; def->tuple_compare_with_key = tuple_compare_with_key_slowpath ; def->tuple_compare_with_key_hinted = tuple_compare_with_key_slowpath_hinted - ; + ; } } @@ -1764,15 +1856,25 @@ static void key_def_set_compare_func_json(struct key_def *def) { assert(def->has_json_paths); - def->tuple_compare = tuple_compare_slowpath - ; - def->tuple_compare_hinted = tuple_compare_slowpath_hinted - ; - def->tuple_compare_with_key = tuple_compare_with_key_slowpath - ; - def->tuple_compare_with_key_hinted = - tuple_compare_with_key_slowpath_hinted - ; + if (key_def_is_multikey(def)) { + def->tuple_compare = tuple_compare_multikey; + def->tuple_compare_hinted = tuple_compare_slowpath_hinted + ; + def->tuple_compare_with_key = tuple_compare_with_key_multikey; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_slowpath_hinted + ; + } else { + def->tuple_compare = tuple_compare_slowpath + ; + def->tuple_compare_hinted = tuple_compare_slowpath_hinted + ; + def->tuple_compare_with_key = tuple_compare_with_key_slowpath + ; + def->tuple_compare_with_key_hinted = + tuple_compare_with_key_slowpath_hinted + ; + } } void diff --git a/src/box/tuple_compare.h b/src/box/tuple_compare.h index 77140ca0c..8614f2320 100644 --- a/src/box/tuple_compare.h +++ b/src/box/tuple_compare.h @@ -39,6 +39,16 @@ extern "C" { struct key_def; /** + * Hints are now used for two purposes - passing the index of the + * key used in the case of multikey index and to speed up the + * comparators. + * + * Scenario I. Pass the multikey index of the key to comparator. + * In the case of multikey index arises an ambiguity: which key + * should be used in the comparison. Hints act as an non-negative + * numeric index of key to use. + * + * Scenario II. Speed up comparators. 
* Tuple comparison hint h(t) is such a function of tuple t that * the following conditions always hold for any pair of tuples * t1 and t2: diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc index 3c49094b8..07f114946 100644 --- a/src/box/tuple_extract_key.cc +++ b/src/box/tuple_extract_key.cc @@ -115,6 +115,7 @@ tuple_extract_key_slowpath(struct tuple *tuple, struct key_def *key_def, assert(has_optional_parts == key_def->has_optional_parts); assert(contains_sequential_parts == key_def_contains_sequential_parts(key_def)); + assert(!key_def_is_multikey(key_def)); assert(mp_sizeof_nil() == 1); const char *data = tuple_data(tuple); uint32_t part_count = key_def->part_count; @@ -234,6 +235,7 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end, assert(has_json_paths == key_def->has_json_paths); assert(!has_optional_parts || key_def->is_nullable); assert(has_optional_parts == key_def->has_optional_parts); + assert(!key_def_is_multikey(key_def)); assert(mp_sizeof_nil() == 1); /* allocate buffer with maximal possible size */ char *key = (char *) region_alloc(&fiber()->gc, data_end - data); @@ -310,7 +312,7 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end, const char *src_end = field_end; if (has_json_paths && part->path != NULL) { if (tuple_go_to_path(&src, part->path, - part->path_len) != 0) { + part->path_len, -1) != 0) { /* * The path must be correct as * it has already been validated @@ -349,6 +351,7 @@ static void key_def_set_extract_func_plain(struct key_def *def) { assert(!def->has_json_paths); + assert(!key_def_is_multikey(def)); if (key_def_is_sequential(def)) { assert(contains_sequential_parts || def->part_count == 1); def->tuple_extract_key = tuple_extract_key_sequential diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c index a7c8e872f..585f2680f 100644 --- a/src/box/tuple_format.c +++ b/src/box/tuple_format.c @@ -152,12 +152,14 @@ tuple_field_new(void) field->offset_slot = TUPLE_OFFSET_SLOT_NIL; field->coll_id = COLL_NONE; field->nullable_action = ON_CONFLICT_ACTION_NONE; + field->multikey_required_fields = NULL; return field; } static void tuple_field_delete(struct tuple_field *field) { + free(field->multikey_required_fields); free(field); } @@ -194,6 +196,50 @@ tuple_format_field_by_id(struct tuple_format *format, uint32_t id) return NULL; } +/** + * Check if child_field may be attached to parent_field, + * update the parent_field container type if required. + */ +static int +tuple_field_ensure_child_compatibility(struct tuple_field *parent, + struct tuple_field *child) +{ + enum field_type expected_type = + child->token.type == JSON_TOKEN_STR ? + FIELD_TYPE_MAP : FIELD_TYPE_ARRAY; + if (field_type1_contains_type2(parent->type, expected_type)) { + parent->type = expected_type; + } else { + diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH, + tuple_field_path(parent), + field_type_strs[parent->type], + field_type_strs[expected_type]); + return -1; + } + /* + * Attempt to append JSON_TOKEN_ANY leaf to parent that + * has other child records already i.e. is a intermediate + * field of non-multikey JSON index. + */ + bool is_not_multikey_parent_conflict = + child->token.type == JSON_TOKEN_ANY && + !json_token_is_multikey(&parent->token) && + !json_token_is_leaf(&parent->token); + /* + * Attempt to append not JSON_TOKEN_ANY child record to + * the parent defined as multikey index root. 
+ */ + bool is_multikey_parent_conflict = + json_token_is_multikey(&parent->token) && + child->token.type != JSON_TOKEN_ANY; + if (is_not_multikey_parent_conflict || is_multikey_parent_conflict) { + diag_set(ClientError, ER_MULTIKEY_INDEX_MISMATCH, + tuple_field_path(parent)); + return -1; + } + return 0; +} + /** * Given a field number and a path, add the corresponding field * to the tuple format, allocating intermediate fields if @@ -215,7 +261,7 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno, goto end; field = tuple_field_new(); if (field == NULL) - goto fail; + return NULL; /* * Retrieve path_len memory chunk from the path_pool and @@ -234,23 +280,8 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno, json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE); while ((rc = json_lexer_next_token(&lexer, &field->token)) == 0 && field->token.type != JSON_TOKEN_END) { - if (field->token.type == JSON_TOKEN_ANY) { - diag_set(ClientError, ER_UNSUPPORTED, - "Tarantool", "multikey indexes"); + if (tuple_field_ensure_child_compatibility(parent, field) != 0) goto fail; - } - enum field_type expected_type = - field->token.type == JSON_TOKEN_STR ? - FIELD_TYPE_MAP : FIELD_TYPE_ARRAY; - if (field_type1_contains_type2(parent->type, expected_type)) { - parent->type = expected_type; - } else { - diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH, - tuple_field_path(parent), - field_type_strs[parent->type], - field_type_strs[expected_type]); - goto fail; - } struct tuple_field *next = json_tree_lookup_entry(tree, &parent->token, &field->token, @@ -268,6 +299,18 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno, if (field == NULL) goto fail; } + if (json_token_is_multikey(&parent->token) && + parent->offset_slot == TUPLE_OFFSET_SLOT_NIL) { + /** + * Allocate offset slot for array is used + * in multikey index. This is required to + * quickly extract its size. + * @see tuple_multikey_count() + */ + assert(parent->type == FIELD_TYPE_ARRAY); + *current_slot = *current_slot - 1; + parent->offset_slot = *current_slot; + } parent->is_key_part = true; parent = next; token_count++; @@ -280,7 +323,6 @@ tuple_format_add_field(struct tuple_format *format, uint32_t fieldno, assert(parent != NULL); /* Update tree depth information. */ format->fields_depth = MAX(format->fields_depth, token_count + 1); -cleanup: tuple_field_delete(field); end: /* @@ -288,15 +330,16 @@ end: * fields of non-sequential keys. First field is always * simply accessible, so we don't store an offset for it. */ - if (parent != NULL && parent->offset_slot == TUPLE_OFFSET_SLOT_NIL && + if (parent->offset_slot == TUPLE_OFFSET_SLOT_NIL && is_sequential == false && (fieldno > 0 || path != NULL)) { *current_slot = *current_slot - 1; parent->offset_slot = *current_slot; } return parent; fail: - parent = NULL; - goto cleanup; + if (field != NULL) + tuple_field_delete(field); + return NULL; } static int @@ -454,6 +497,34 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys, if (json_token_is_leaf(&field->token) && !tuple_field_is_nullable(field)) bit_set(format->required_fields, field->id); + /* + * When the field represents array index + * placeholder [*] (JSON_TOKEN_ANY), it requires + * own multikey_required_fields allocation, that + * must be initialized by bypassing the JSON + * subtree of this field. 
+		 */
+		if (field->token.type == JSON_TOKEN_ANY) {
+			assert(field->multikey_required_fields == NULL);
+			void *multikey_required_fields =
+				calloc(1, required_fields_sz);
+			if (multikey_required_fields == NULL) {
+				diag_set(OutOfMemory, required_fields_sz,
+					 "malloc", "required field bitmap");
+				return -1;
+			}
+			struct tuple_field *child;
+			json_tree_foreach_entry_preorder(child, &field->token,
+						struct tuple_field, token) {
+				if (json_token_is_leaf(&child->token) &&
+				    !tuple_field_is_nullable(child)) {
+					bit_set(multikey_required_fields,
+						child->id);
+				}
+			}
+			field->multikey_required_fields =
+				multikey_required_fields;
+		}
 	}
 	format->hash = tuple_format_hash(format);
 	return 0;
@@ -770,6 +841,32 @@ tuple_format1_can_store_format2_tuples(struct tuple_format *format1,
 	return true;
 }
 
+/**
+ * Scan the required_fields bitmap and raise an error if it
+ * is non-empty.
+ * @sa format:required_fields and field:multikey_required_fields
+ * definition.
+ */
+static int
+tuple_format_required_fields_validate(struct tuple_format *format,
+				      void *required_fields,
+				      uint32_t required_fields_sz)
+{
+	struct bit_iterator it;
+	bit_iterator_init(&it, required_fields, required_fields_sz, true);
+	size_t id = bit_iterator_next(&it);
+	if (id < SIZE_MAX) {
+		/* A field is missing, report an error. */
+		struct tuple_field *field =
+			tuple_format_field_by_id(format, id);
+		assert(field != NULL);
+		diag_set(ClientError, ER_FIELD_MISSING,
+			 tuple_field_path(field));
+		return -1;
+	}
+	return 0;
+}
+
 /** @sa declaration for details. */
 int
 tuple_field_map_create(struct tuple_format *format, const char *tuple,
@@ -801,18 +898,30 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 	 * Allocate a field bitmap that will be used for checking
 	 * that all mandatory fields are present.
 	 */
-	void *required_fields = NULL;
+	void *required_fields = NULL, *multikey_required_fields = NULL;
 	size_t required_fields_sz = 0;
 	if (validate) {
 		required_fields_sz = bitmap_size(format->total_field_count);
-		required_fields = region_alloc(region, required_fields_sz);
+		required_fields = region_alloc(region, 2 * required_fields_sz);
 		if (required_fields == NULL) {
-			diag_set(OutOfMemory, required_fields_sz,
+			diag_set(OutOfMemory, 2 * required_fields_sz,
 				 "region", "required field bitmap");
 			return -1;
 		}
 		memcpy(required_fields, format->required_fields,
 		       required_fields_sz);
+		/*
+		 * A multikey index definition with a non-nullable
+		 * part can't be verified with required_fields
+		 * alone: an array entry may miss a required key
+		 * while its required_fields bit was already
+		 * cleared by another entry. The additional
+		 * multikey_required_fields allocation allows checking
+		 * each multikey entry when its subtree decoding finishes.
+		 */
+		multikey_required_fields =
+			(char *)required_fields + required_fields_sz;
+		memset(multikey_required_fields, 0, required_fields_sz);
 	}
 	/*
 	 * Initialize the tuple field map and validate field types.
@@ -828,10 +937,34 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 	const char *pos_end;
 	struct tuple_field *field;
-	while (tuple_format_iterator_next(&it, false, &field, &pos,
-					  &pos_end) != TUPLE_FORMAT_ITERATOR_STOP) {
+	enum tuple_format_iterator_status rc;
+	while ((rc = tuple_format_iterator_next(&it, false, &field, &pos,
+			&pos_end)) != TUPLE_FORMAT_ITERATOR_STOP) {
+		if (rc == TUPLE_FORMAT_ITERATOR_MULTIKEY_POP) {
+			/*
+			 * Processing of the next multikey index
+			 * format subtree has finished; test that
+			 * all required fields are present.
+			 */
+			if (tuple_format_required_fields_validate(format,
+					multikey_required_fields,
+					required_fields_sz) != 0)
+				return -1;
+			continue;
+		}
 		if (field == NULL)
 			continue;
+		if (validate && field->token.type == JSON_TOKEN_ANY) {
+			/**
+			 * Start processing of the next multikey
+			 * index element: reinitialize the multikey
+			 * required fields bitmap.
+			 */
+			assert(it.multikey_frame != NULL);
+			memcpy(multikey_required_fields,
+			       field->multikey_required_fields,
+			       required_fields_sz);
+		}
 		/*
 		 * Check if field mp_type is compatible with type
 		 * defined in format.
@@ -846,32 +979,31 @@ tuple_field_map_create(struct tuple_format *format, const char *tuple,
 			return -1;
 		}
 		/* Initialize field_map with data offset. */
-		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
+		if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL &&
+		    it.multikey_frame != NULL && it.multikey_frame->idx >= 0) {
+			if (field_map_builder_set_slot_multikey(builder,
+					field->offset_slot, pos - tuple,
+					it.multikey_frame->idx,
+					it.multikey_frame->count, region) != 0)
+				return -1;
+		} else if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
 			field_map_builder_set_slot(builder, field->offset_slot,
 						   pos - tuple);
 		}
-		/* Mark this field as present in the tuple. */
-		if (required_fields != NULL)
+		if (validate) {
+			if (it.multikey_frame != NULL)
+				bit_clear(multikey_required_fields, field->id);
 			bit_clear(required_fields, field->id);
+		}
 	}
 finish:
 	/*
 	 * Check the required field bitmap for missing fields.
 	 */
-	if (required_fields != NULL) {
-		struct bit_iterator it;
-		bit_iterator_init(&it, required_fields,
-				  required_fields_sz, true);
-		size_t id = bit_iterator_next(&it);
-		if (id < SIZE_MAX) {
-			/* A field is missing, report an error. */
-			field = tuple_format_field_by_id(format, id);
-			assert(field != NULL);
-			diag_set(ClientError, ER_FIELD_MISSING,
-				 tuple_field_path(field));
-			return -1;
-		}
-	}
+	if (validate &&
+	    tuple_format_required_fields_validate(format, required_fields,
+						  required_fields_sz) != 0)
+		return -1;
 	return 0;
 }
 
@@ -954,6 +1086,7 @@ tuple_format_iterator_create(struct tuple_format_iterator *it,
 	it->parent = &format->fields.root;
 	it->format = format;
 	it->pos = field;
+	it->multikey_frame = NULL;
 	if (field_count == 0) {
 		mp_stack_create(&it->stack, 0, NULL);
 		return 0;
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 0b78d9995..343df2d2d 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -117,6 +117,13 @@ struct tuple_field {
 	struct coll *coll;
 	/** Collation identifier. */
 	uint32_t coll_id;
+	/**
+	 * Bitmap of fields that must be present in a tuple
+	 * conforming to the multikey subtree. Not NULL only
+	 * when tuple_field:token.type == JSON_TOKEN_ANY.
+	 * Indexed by tuple_field::id.
+	 */
+	void *multikey_required_fields;
 	/** Link in tuple_format::fields. */
 	struct json_token token;
 };
@@ -170,7 +177,9 @@ struct tuple_format {
 	 */
 	bool is_ephemeral;
 	/**
-	 * Size of field map of tuple in bytes.
+	 * Size of the minimal field map of a tuple where each
+	 * indexed field has its own offset slot (in bytes). The
+	 * real tuple field_map may be bigger in case of multikey indexes.
 	 * \sa struct field_map_builder
 	 */
 	uint16_t field_map_size;
@@ -433,6 +442,12 @@ struct tuple_format_iterator {
 	 * pointer accordingly.
 	 */
 	struct mp_stack stack;
+	/**
+	 * A pointer to the stack frame representing an array
+	 * field that has a JSON_TOKEN_ANY child, i.e. the root
+	 * of a multikey index.
+	 */
+	struct mp_frame *multikey_frame;
 	/** The current read position in msgpack.
	 */
 	const char *pos;
 };
@@ -455,6 +470,7 @@ tuple_format_iterator_create(struct tuple_format_iterator *it,
 enum tuple_format_iterator_status {
 	TUPLE_FORMAT_ITERATOR_STOP,
 	TUPLE_FORMAT_ITERATOR_NEXT,
+	TUPLE_FORMAT_ITERATOR_MULTIKEY_POP,
 };
 
 /**
@@ -490,7 +506,22 @@ tuple_format_iterator_next(struct tuple_format_iterator *it, bool key_parts_only
 		if (mp_stack_is_empty(&it->stack))
 			return TUPLE_FORMAT_ITERATOR_STOP;
 		frame = mp_stack_top(&it->stack);
+		if (json_token_is_multikey(it->parent)) {
+			/*
+			 * All multikey index entries have been
+			 * processed. Reset the pointer to the
+			 * corresponding multikey frame.
+			 */
+			it->multikey_frame = NULL;
+		}
 		it->parent = it->parent->parent;
+		if (json_token_is_multikey(it->parent)) {
+			/*
+			 * Interrupt once processing of a multikey
+			 * index entry has finished.
+			 */
+			return TUPLE_FORMAT_ITERATOR_MULTIKEY_POP;
+		}
 	}
 	/*
 	 * Use the top frame of the stack and the
@@ -542,6 +573,19 @@ tuple_format_iterator_next(struct tuple_format_iterator *it, bool key_parts_only
 			       mp_decode_array(&it->pos) :
 			       mp_decode_map(&it->pos);
 		mp_stack_push(&it->stack, type, size);
+		if (json_token_is_multikey(&(*field)->token)) {
+			/**
+			 * Keep a pointer to the frame that
+			 * describes the array with the index
+			 * placeholder [*]. The "current" item
+			 * of this frame matches the logical
+			 * index of the item in the multikey
+			 * index and is equal to the multikey
+			 * comparison hint.
+			 */
+			assert(type == MP_ARRAY);
+			it->multikey_frame = mp_stack_top(&it->stack);
+		}
 		it->parent = &(*field)->token;
 	} else {
 		mp_next(&it->pos);
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 41918beba..f63383f87 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -701,6 +701,11 @@ vinyl_space_check_index_def(struct space *space, struct index_def *index_def)
 			return -1;
 		}
 	}
+	if (key_def_is_multikey(index_def->key_def)) {
+		diag_set(ClientError, ER_UNSUPPORTED,
+			 "Vinyl", "multikey indexes");
+		return -1;
+	}
 	return 0;
 }
 
@@ -1018,6 +1023,8 @@ vinyl_index_def_change_requires_rebuild(struct index *index,
 				    TUPLE_INDEX_BASE) != 0)
 			return true;
 	}
+	assert(key_def_is_multikey(old_cmp_def) ==
+	       key_def_is_multikey(new_cmp_def));
 	return false;
 }
 
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 9db6ffcf1..93267236e 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -440,8 +440,10 @@ vy_stmt_new_surrogate_delete_raw(struct tuple_format *format,
 	char *pos = mp_encode_array(data, field_count);
 	const char *src_pos_end;
 	struct tuple_field *field;
-	while (tuple_format_iterator_next(&it, true, &field, &src_pos,
-					  &src_pos_end) != TUPLE_FORMAT_ITERATOR_STOP) {
+	enum tuple_format_iterator_status rc;
+	while ((rc = tuple_format_iterator_next(&it, true, &field, &src_pos,
+			&src_pos_end)) != TUPLE_FORMAT_ITERATOR_STOP) {
+		assert(rc != TUPLE_FORMAT_ITERATOR_MULTIKEY_POP);
 		struct mp_frame *frame = mp_stack_top(&it.stack);
 		if (field == NULL) {
 			/*
diff --git a/test/box/misc.result b/test/box/misc.result
index a1f7a0990..a2ee3ad9f 100644
--- a/test/box/misc.result
+++ b/test/box/misc.result
@@ -522,6 +522,7 @@ t;
   191: box.error.SQL_PARSER_LIMIT
   192: box.error.INDEX_DEF_UNSUPPORTED
   193: box.error.CK_DEF_UNSUPPORTED
+  194: box.error.MULTIKEY_INDEX_MISMATCH
 ...
test_run:cmd("setopt delimiter ''"); --- diff --git a/test/engine/engine.cfg b/test/engine/engine.cfg index 9f07629b4..8f34bae88 100644 --- a/test/engine/engine.cfg +++ b/test/engine/engine.cfg @@ -2,6 +2,9 @@ "*": { "memtx": {"engine": "memtx"}, "vinyl": {"engine": "vinyl"} + }, + "multikey.test.lua": { + "memtx": {"engine": "memtx"} } } diff --git a/test/engine/json.result b/test/engine/json.result index 84b1309e1..e5a36a8f5 100644 --- a/test/engine/json.result +++ b/test/engine/json.result @@ -722,16 +722,3 @@ s:delete(5) s:drop() --- ... --- --- gh-1260: Multikey indexes --- -s = box.schema.space.create('withdata') ---- -... -idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) ---- -- error: Tarantool does not support multikey indexes -... -s:drop() ---- -... diff --git a/test/engine/json.test.lua b/test/engine/json.test.lua index e864ec14d..592706e0e 100644 --- a/test/engine/json.test.lua +++ b/test/engine/json.test.lua @@ -206,10 +206,3 @@ idx0 = s:create_index('idx2', {parts = {{3, 'str', path = '[1].fname'}, {3, 'str s:insert({5, {1, 1, 1}, {{fname='A', sname='B'}, {fname='C', sname='D'}, {fname='A', sname='B'}}}) s:delete(5) s:drop() - --- --- gh-1260: Multikey indexes --- -s = box.schema.space.create('withdata') -idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) -s:drop() diff --git a/test/engine/multikey.result b/test/engine/multikey.result new file mode 100644 index 000000000..293e27f1c --- /dev/null +++ b/test/engine/multikey.result @@ -0,0 +1,789 @@ +test_run = require('test_run').new() +--- +... +engine = test_run:get_cfg('engine') +--- +... +-- +-- gh-1260: Multikey indexes. +-- +s = box.schema.space.create('withdata', {engine = 'vinyl'}) +--- +... +pk = s:create_index('pk') +--- +... +-- Vinyl's space can't be multikey (yet). +_ = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +--- +- error: Vinyl does not support multikey indexes +... +s:drop() +--- +... +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +-- Primary index must be unique so it can't be multikey. +_ = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}}}) +--- +- error: 'Can''t create or modify index ''idx'' in space ''withdata'': primary key + cannot be multikey' +... +pk = s:create_index('pk') +--- +... +-- Only tree index type may be mutlikey. +_ = s:create_index('idx', {type = 'hash', unique = true, parts = {{3, 'str', path = '[*].fname'}}}) +--- +- error: 'Can''t create or modify index ''idx'' in space ''withdata'': HASH index + cannot be multikey' +... +_ = s:create_index('idx', {type = 'bitset', unique = false, parts = {{3, 'str', path = '[*].fname'}}}) +--- +- error: 'Can''t create or modify index ''idx'' in space ''withdata'': BITSET index + cannot be multikey' +... +_ = s:create_index('idx', {type = 'rtree', unique = false, parts = {{3, 'array', path = '[*].fname'}}}) +--- +- error: 'Can''t create or modify index ''idx'' in space ''withdata'': RTREE index + cannot be multikey' +... +-- Test incompatible multikey index parts. +_ = s:create_index('idx3', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '["data"][*].sname'}}}) +--- +- error: 'Wrong index options (field 2): incompatable multikey index path' +... 
+_ = s:create_index('idx2', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname[*].a'}}}) +--- +- error: 'Wrong index options (field 2): no more than one array index placeholder + [*] is allowed in JSON path' +... +idx0 = s:create_index('idx0', {parts = {{3, 'str', path = '[1].fname'}}}) +--- +... +_ = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +--- +- error: Field 3 is used as multikey in one index and as single key in another +... +idx0:drop() +--- +... +-- Unique multikey index. +idx = s:create_index('idx', {unique = true, parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +--- +... +_ = s:create_index('idx2', {parts = {{3, 'str', path = '[1].fname'}, {3, 'str', path = '[1].sname'}}}) +--- +- error: Field 3 is used as multikey in one index and as single key in another +... +s:insert({1, {1, 2, 3}, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}}) +--- +- [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({2, {3, 4, 5}, {{fname='Ivan', sname='Ivanych'}}}) +--- +- [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] +... +_ = s:create_index('arr_idx', {unique = true, parts = {{2, 'unsigned', path = '[*]'}}}) +--- +- error: Duplicate key exists in unique index 'arr_idx' in space 'withdata' +... +-- Non-unique multikey index; two multikey indexes per space. +arr_idx = s:create_index('arr_idx', {unique = false, parts = {{2, 'unsigned', path = '[*]'}}}) +--- +... +arr_idx:select() +--- +- - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] +... +idx:get({'James', 'Bond'}) +--- +- [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +idx:get({'Ivan', 'Ivanych'}) +--- +- [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] +... +idx:get({'Vasya', 'Pupkin'}) +--- +- [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +idx:select() +--- +- - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({3, {1, 2}, {{fname='Vasya', sname='Pupkin'}}}) +--- +- error: Duplicate key exists in unique index 'idx' in space 'withdata' +... +s:insert({4, {1}, {{fname='James', sname='Bond'}}}) +--- +- error: Duplicate key exists in unique index 'idx' in space 'withdata' +... +idx:select() +--- +- - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +-- Duplicates in multikey parts. +s:insert({5, {1, 1, 1}, {{fname='A', sname='B'}, {fname='C', sname='D'}, {fname='A', sname='B'}}}) +--- +- [5, [1, 1, 1], [{'fname': 'A', 'sname': 'B'}, {'fname': 'C', 'sname': 'D'}, {'fname': 'A', + 'sname': 'B'}]] +... 
+arr_idx:select({1}) +--- +- - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [5, [1, 1, 1], [{'fname': 'A', 'sname': 'B'}, {'fname': 'C', 'sname': 'D'}, { + 'fname': 'A', 'sname': 'B'}]] +... +s:delete(5) +--- +- [5, [1, 1, 1], [{'fname': 'A', 'sname': 'B'}, {'fname': 'C', 'sname': 'D'}, {'fname': 'A', + 'sname': 'B'}]] +... +-- Check that there is no garbage in index. +arr_idx:select({1}) +--- +- - [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +idx:get({'A', 'B'}) +--- +... +idx:get({'C', 'D'}) +--- +... +idx:delete({'Vasya', 'Pupkin'}) +--- +- [1, [1, 2, 3], [{'fname': 'James', 'sname': 'Bond'}, {'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({6, {1, 2}, {{fname='Vasya', sname='Pupkin'}}}) +--- +- [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +s:insert({7, {1}, {{fname='James', sname='Bond'}}}) +--- +- [7, [1], [{'fname': 'James', 'sname': 'Bond'}]] +... +arr_idx:select({1}) +--- +- - [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [7, [1], [{'fname': 'James', 'sname': 'Bond'}]] +... +idx:select() +--- +- - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [7, [1], [{'fname': 'James', 'sname': 'Bond'}]] + - [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +-- Snapshot & recovery. +box.snapshot() +--- +- ok +... +test_run:cmd("restart server default") +s = box.space["withdata"] +--- +... +idx = s.index["idx"] +--- +... +arr_idx = s.index["arr_idx"] +--- +... +s:select() +--- +- - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [7, [1], [{'fname': 'James', 'sname': 'Bond'}]] +... +idx:select() +--- +- - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [7, [1], [{'fname': 'James', 'sname': 'Bond'}]] + - [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] +... +arr_idx:select() +--- +- - [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [7, [1], [{'fname': 'James', 'sname': 'Bond'}]] + - [6, [1, 2], [{'fname': 'Vasya', 'sname': 'Pupkin'}]] + - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] + - [2, [3, 4, 5], [{'fname': 'Ivan', 'sname': 'Ivanych'}]] +... +s:drop() +--- +... +-- Assymetric multikey index paths. +test_run = require('test_run').new() +--- +... +engine = test_run:get_cfg('engine') +--- +... +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}}) +--- +... +s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}}) +--- +- [1, 1, [{'fname': 'A1', 'extra': {'sname': 'A2'}}, {'fname': 'B1'}, {'fname': 'C1', + 'extra': {'sname': 'C2'}}]] +... +s:drop() +--- +... +-- Unique multikey index features. +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx0 = s:create_index('idx0', {parts = {{2, 'int', path = '[*]'}}}) +--- +... +s:insert({1, {1, 1, 1}}) +--- +- [1, [1, 1, 1]] +... +s:insert({2, {2, 2}}) +--- +- [2, [2, 2]] +... +s:insert({3, {3, 3, 2, 2, 1, 1}}) +--- +- error: Duplicate key exists in unique index 'idx0' in space 'withdata' +... +idx0:get(2) +--- +- [2, [2, 2]] +... +idx0:get(1) +--- +- [1, [1, 1, 1]] +... +idx0:get(3) +--- +... +idx0:select() +--- +- - [1, [1, 1, 1]] + - [2, [2, 2]] +... 
+idx0:delete(2) +--- +- [2, [2, 2]] +... +idx0:get(2) +--- +... +idx0:select() +--- +- - [1, [1, 1, 1]] +... +s:drop() +--- +... +-- Test user JSON endpoint doesn't fail in case of multikey index. +t = box.tuple.new{1, {a = 1, b = 2}, 3} +--- +... +t['[2][*]'] +--- +- null +... +-- Test multikey index deduplication on recovery. +test_run = require('test_run').new() +--- +... +engine = test_run:get_cfg('engine') +--- +... +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx0 = s:create_index('idx0', {parts = {{2, 'int', path = '[*]', is_nullable = true}}}) +--- +... +s:insert({1, {1, 1, 3, 1}}) +--- +- [1, [1, 1, 3, 1]] +... +idx0:select() +--- +- - [1, [1, 1, 3, 1]] + - [1, [1, 1, 3, 1]] +... +s:replace({1, {5, 5, 5, 1, 3, 3}}) +--- +- [1, [5, 5, 5, 1, 3, 3]] +... +idx0:select() +--- +- - [1, [5, 5, 5, 1, 3, 3]] + - [1, [5, 5, 5, 1, 3, 3]] + - [1, [5, 5, 5, 1, 3, 3]] +... +-- Test update. +s:update(1, {{'=', 2, {20, 10, 30, 30}}}) +--- +- [1, [20, 10, 30, 30]] +... +idx0:select() +--- +- - [1, [20, 10, 30, 30]] + - [1, [20, 10, 30, 30]] + - [1, [20, 10, 30, 30]] +... +box.snapshot() +--- +- ok +... +test_run:cmd("restart server default") +s = box.space.withdata +--- +... +s:select() +--- +- - [1, [20, 10, 30, 30]] +... +idx0 = s.index.idx0 +--- +... +idx0:select() +--- +- - [1, [20, 10, 30, 30]] + - [1, [20, 10, 30, 30]] + - [1, [20, 10, 30, 30]] +... +s:insert({2, {2, 4, 2}}) +--- +- [2, [2, 4, 2]] +... +s:select() +--- +- - [1, [20, 10, 30, 30]] + - [2, [2, 4, 2]] +... +idx0:select() +--- +- - [2, [2, 4, 2]] + - [2, [2, 4, 2]] + - [1, [20, 10, 30, 30]] + - [1, [20, 10, 30, 30]] + - [1, [20, 10, 30, 30]] +... +s:drop() +--- +... +-- Test upsert & reverse iterators. +test_run = require('test_run').new() +--- +... +engine = test_run:get_cfg('engine') +--- +... +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx0 = s:create_index('idx0', {unique=false, parts = {{2, 'int', path = '[*]'}}}) +--- +... +s:insert({1, {1, 1, 1, 2, 2}}) +--- +- [1, [1, 1, 1, 2, 2]] +... +s:insert({2, {3, 3, 4, 4, 4}}) +--- +- [2, [3, 3, 4, 4, 4]] +... +s:insert({3, {5, 5, 5, 5, 6, 6, 6, 6}}) +--- +- [3, [5, 5, 5, 5, 6, 6, 6, 6]] +... +idx0:select(5, {iterator = box.index.LT}) +--- +- - [2, [3, 3, 4, 4, 4]] + - [2, [3, 3, 4, 4, 4]] + - [1, [1, 1, 1, 2, 2]] + - [1, [1, 1, 1, 2, 2]] +... +idx0:select(5, {iterator = box.index.GE}) +--- +- - [3, [5, 5, 5, 5, 6, 6, 6, 6]] + - [3, [5, 5, 5, 5, 6, 6, 6, 6]] +... +idx0:select() +--- +- - [1, [1, 1, 1, 2, 2]] + - [1, [1, 1, 1, 2, 2]] + - [2, [3, 3, 4, 4, 4]] + - [2, [3, 3, 4, 4, 4]] + - [3, [5, 5, 5, 5, 6, 6, 6, 6]] + - [3, [5, 5, 5, 5, 6, 6, 6, 6]] +... +s:upsert({3, {5, 5, 5, 5, 6, 6, 6, 6}}, {{'=', 2, {1, 1, 2, 2, 3, 3, 4, 4}}}) +--- +... +idx0:select() +--- +- - [1, [1, 1, 1, 2, 2]] + - [3, [1, 1, 2, 2, 3, 3, 4, 4]] + - [1, [1, 1, 1, 2, 2]] + - [3, [1, 1, 2, 2, 3, 3, 4, 4]] + - [2, [3, 3, 4, 4, 4]] + - [3, [1, 1, 2, 2, 3, 3, 4, 4]] + - [2, [3, 3, 4, 4, 4]] + - [3, [1, 1, 2, 2, 3, 3, 4, 4]] +... +s:drop() +--- +... +-- Test multikey index with epsent data (1 part, is_nullable = false). +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx0 = s:create_index('idx0', {parts = {{2, 'str', path = '[*].name'}}}) +--- +... +s:insert({0, {{fname='A0'}, {fname='B0'}, {fname='C0'}}}) +--- +- error: Tuple field [2][*]["name"] required by space format is missing +... 
+s:insert({0, {{fname='A0'}, {fname='B0', name='ZB1'}, {fname='C0'}}}) +--- +- error: Tuple field [2][*]["name"] required by space format is missing +... +s:insert({0, {{fname='A0', name='ZA1'}, {fname='B0', name='ZB1'}, {fname='C0'}}}) +--- +- error: Tuple field [2][*]["name"] required by space format is missing +... +s:insert({0, {{fname='A0', name='ZA1'}, {fname='B0', name='ZB1'}, {fname='C0', name='ZC1'}}}) +--- +- [0, [{'name': 'ZA1', 'fname': 'A0'}, {'name': 'ZB1', 'fname': 'B0'}, {'name': 'ZC1', + 'fname': 'C0'}]] +... +s:drop() +--- +... +-- Test multikey replacement conflict. +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx0 = s:create_index('idx0', {parts = {{2, 'str', path = '[*].name', is_nullable=true}}}) +--- +... +s:insert({0, {{fname='A0'}, {fname='B0'}, {fname='C0', name='CONFLICT1'}}}) +--- +- [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'name': 'CONFLICT1', 'fname': 'C0'}]] +... +s:insert({1, {{fname='A1'}, {fname='B1'}, {fname='C1', name='CONFLICT2'}}}) +--- +- [1, [{'fname': 'A1'}, {'fname': 'B1'}, {'name': 'CONFLICT2', 'fname': 'C1'}]] +... +s:insert({2, {{fname='A2_1'}, {fname='B2_1', name='ZB2_1'}, {fname='C2_1'}, {name="DUP"}, {name="DUP"}}}) +--- +- [2, [{'fname': 'A2_1'}, {'name': 'ZB2_1', 'fname': 'B2_1'}, {'fname': 'C2_1'}, { + 'name': 'DUP'}, {'name': 'DUP'}]] +... +s:replace({2, {{fname='A2_2'}, {fname='B2_2', name='ZB2_2'}, {fname='C2_2'}, {name="DUP"}, {name="DUP"}}}) +--- +- [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, { + 'name': 'DUP'}, {'name': 'DUP'}]] +... +s:replace({2, {{fname='A2_3'}, {fname='B2_3', name='ZB2_3'}, {fname='C2_3'}, {name="DUP"}, {name='CONFLICT1'}, {name='CONFLICT2'}, {name="DUP"}}}) +--- +- error: Duplicate key exists in unique index 'idx0' in space 'withdata' +... +s:replace({2, {{fname='A2_4'}, {fname='B2_4', name='ZB2_4'}, {fname='C2_4'}, {name="DUP"}, {name='CONFLICT2'}, {name='CONFLICT1'}, {name="DUP"}}}) +--- +- error: Duplicate key exists in unique index 'idx0' in space 'withdata' +... +idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'name': 'CONFLICT1', 'fname': 'C0'}]] + - [1, [{'fname': 'A1'}, {'fname': 'B1'}, {'name': 'CONFLICT2', 'fname': 'C1'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'name': 'CONFLICT1', 'fname': 'C0'}]] + - [1, [{'fname': 'A1'}, {'fname': 'B1'}, {'name': 'CONFLICT2', 'fname': 'C1'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] +... +box.snapshot() +--- +- ok +... +test_run:cmd("restart server default") +s = box.space.withdata +--- +... +idx0 = s.index.idx0 +--- +... +s:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'name': 'CONFLICT1', 'fname': 'C0'}]] + - [1, [{'fname': 'A1'}, {'fname': 'B1'}, {'name': 'CONFLICT2', 'fname': 'C1'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] +... 
+idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'name': 'CONFLICT1', 'fname': 'C0'}]] + - [1, [{'fname': 'A1'}, {'fname': 'B1'}, {'name': 'CONFLICT2', 'fname': 'C1'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'name': 'CONFLICT1', 'fname': 'C0'}]] + - [1, [{'fname': 'A1'}, {'fname': 'B1'}, {'name': 'CONFLICT2', 'fname': 'C1'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2_2'}, {'name': 'ZB2_2', 'fname': 'B2_2'}, {'fname': 'C2_2'}, + {'name': 'DUP'}, {'name': 'DUP'}]] +... +s:drop() +--- +... +-- Test multikey index with epsent data (2 parts, is_nullable = true). +test_run = require('test_run').new() +--- +... +engine = test_run:get_cfg('engine') +--- +... +s = box.schema.space.create('withdata', {engine = engine}) +--- +... +pk = s:create_index('pk') +--- +... +idx0 = s:create_index('idx0', {unique=false, parts = {{2, 'str', path = '[*].name', is_nullable=true}}}) +--- +... +-- No idx0 index data at all. +s:insert({0, {{fname='A0'}, {fname='B0'}, {fname='C0'}}}) +--- +- [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] +... +idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] +... +-- Only one field corresponds idx0. +s:insert({1, {{fname='A1'}, {fname='B1', name='ZB1'}, {fname='C1'}}}) +--- +- [1, [{'fname': 'A1'}, {'name': 'ZB1', 'fname': 'B1'}, {'fname': 'C1'}]] +... +s:insert({2, {{fname='A2'}, {fname='B2', name='ZB2'}, {fname='C2'}, {name="DUP"}, {name="DUP"}}}) +--- +- [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] +... +idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [1, [{'fname': 'A1'}, {'name': 'ZB1', 'fname': 'B1'}, {'fname': 'C1'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'fname': 'A1'}, {'name': 'ZB1', 'fname': 'B1'}, {'fname': 'C1'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] +... +s:replace({1, {{fname='A3'}, {fname='B3', name='ZB3'}, {fname='C3'}}}) +--- +- [1, [{'fname': 'A3'}, {'name': 'ZB3', 'fname': 'B3'}, {'fname': 'C3'}]] +... +idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [1, [{'fname': 'A3'}, {'name': 'ZB3', 'fname': 'B3'}, {'fname': 'C3'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'fname': 'A3'}, {'name': 'ZB3', 'fname': 'B3'}, {'fname': 'C3'}]] +... +s:replace({1, {{fname='A4', name='ZA4'}, {fname='B4', name='ZB4'}, {fname='C4', name='ZC4'}, {name="DUP"}, {name="DUP"}, {name="DUP"}}}) +--- +- [1, [{'name': 'ZA4', 'fname': 'A4'}, {'name': 'ZB4', 'fname': 'B4'}, {'name': 'ZC4', + 'fname': 'C4'}, {'name': 'DUP'}, {'name': 'DUP'}, {'name': 'DUP'}]] +... 
+idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA4', 'fname': 'A4'}, {'name': 'ZB4', 'fname': 'B4'}, {'name': 'ZC4', + 'fname': 'C4'}, {'name': 'DUP'}, {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA4', 'fname': 'A4'}, {'name': 'ZB4', 'fname': 'B4'}, {'name': 'ZC4', + 'fname': 'C4'}, {'name': 'DUP'}, {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA4', 'fname': 'A4'}, {'name': 'ZB4', 'fname': 'B4'}, {'name': 'ZC4', + 'fname': 'C4'}, {'name': 'DUP'}, {'name': 'DUP'}, {'name': 'DUP'}]] + - [1, [{'name': 'ZA4', 'fname': 'A4'}, {'name': 'ZB4', 'fname': 'B4'}, {'name': 'ZC4', + 'fname': 'C4'}, {'name': 'DUP'}, {'name': 'DUP'}, {'name': 'DUP'}]] +... +s:replace({1, {{fname='A5', name='ZA5'}, {fname='B5'}, {fname='C5'}, {name="DUP"}, {name="DUP"}, {name="DUP"}}}) +--- +- [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] +... +idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] +... +s:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] +... +box.snapshot() +--- +- ok +... +test_run:cmd("restart server default") +s = box.space.withdata +--- +... +idx0 = s.index.idx0 +--- +... +s:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] +... 
+idx0:select() +--- +- - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] +... +-- Test multikey index alter. +idx0:alter({parts = {{2, 'str', path = '[*].fname', is_nullable=true}}}) +--- +... +idx0:select() +--- +- - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] + - [0, [{'fname': 'A0'}, {'fname': 'B0'}, {'fname': 'C0'}]] + - [2, [{'fname': 'A2'}, {'name': 'ZB2', 'fname': 'B2'}, {'fname': 'C2'}, {'name': 'DUP'}, + {'name': 'DUP'}]] + - [1, [{'name': 'ZA5', 'fname': 'A5'}, {'fname': 'B5'}, {'fname': 'C5'}, {'name': 'DUP'}, + {'name': 'DUP'}, {'name': 'DUP'}]] +... +s:drop() +--- +... +-- Create unique multikey index on space with tuple which +-- contains the same key multiple times. +test_run = require('test_run').new() +--- +... +engine = test_run:get_cfg('engine') +--- +... +s = box.schema.space.create('test', {engine = engine}) +--- +... +i = s:create_index('pk') +--- +... +s:replace{1, {{1, 1}}} +--- +- [1, [[1, 1]]] +... +s:replace{2, {{2, 3}}} +--- +- [2, [[2, 3]]] +... +i2 = s:create_index('sk', {parts = {{2, 'unsigned', path = '[1][*]'}}}) +--- +... +i2:select() +--- +- - [1, [[1, 1]]] + - [2, [[2, 3]]] + - [2, [[2, 3]]] +... +s:drop() +--- +... diff --git a/test/engine/multikey.test.lua b/test/engine/multikey.test.lua new file mode 100644 index 000000000..0916bf527 --- /dev/null +++ b/test/engine/multikey.test.lua @@ -0,0 +1,206 @@ +test_run = require('test_run').new() +engine = test_run:get_cfg('engine') + +-- +-- gh-1260: Multikey indexes. +-- +s = box.schema.space.create('withdata', {engine = 'vinyl'}) +pk = s:create_index('pk') +-- Vinyl's space can't be multikey (yet). +_ = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +s:drop() + +s = box.schema.space.create('withdata', {engine = engine}) +-- Primary index must be unique so it can't be multikey. +_ = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}}}) +pk = s:create_index('pk') +-- Only tree index type may be mutlikey. 
+_ = s:create_index('idx', {type = 'hash', unique = true, parts = {{3, 'str', path = '[*].fname'}}}) +_ = s:create_index('idx', {type = 'bitset', unique = false, parts = {{3, 'str', path = '[*].fname'}}}) +_ = s:create_index('idx', {type = 'rtree', unique = false, parts = {{3, 'array', path = '[*].fname'}}}) +-- Test incompatible multikey index parts. +_ = s:create_index('idx3', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '["data"][*].sname'}}}) +_ = s:create_index('idx2', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname[*].a'}}}) +idx0 = s:create_index('idx0', {parts = {{3, 'str', path = '[1].fname'}}}) +_ = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +idx0:drop() +-- Unique multikey index. +idx = s:create_index('idx', {unique = true, parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].sname'}}}) +_ = s:create_index('idx2', {parts = {{3, 'str', path = '[1].fname'}, {3, 'str', path = '[1].sname'}}}) +s:insert({1, {1, 2, 3}, {{fname='James', sname='Bond'}, {fname='Vasya', sname='Pupkin'}}}) +s:insert({2, {3, 4, 5}, {{fname='Ivan', sname='Ivanych'}}}) +_ = s:create_index('arr_idx', {unique = true, parts = {{2, 'unsigned', path = '[*]'}}}) +-- Non-unique multikey index; two multikey indexes per space. +arr_idx = s:create_index('arr_idx', {unique = false, parts = {{2, 'unsigned', path = '[*]'}}}) +arr_idx:select() +idx:get({'James', 'Bond'}) +idx:get({'Ivan', 'Ivanych'}) +idx:get({'Vasya', 'Pupkin'}) +idx:select() +s:insert({3, {1, 2}, {{fname='Vasya', sname='Pupkin'}}}) +s:insert({4, {1}, {{fname='James', sname='Bond'}}}) +idx:select() +-- Duplicates in multikey parts. +s:insert({5, {1, 1, 1}, {{fname='A', sname='B'}, {fname='C', sname='D'}, {fname='A', sname='B'}}}) +arr_idx:select({1}) +s:delete(5) +-- Check that there is no garbage in index. +arr_idx:select({1}) +idx:get({'A', 'B'}) +idx:get({'C', 'D'}) +idx:delete({'Vasya', 'Pupkin'}) +s:insert({6, {1, 2}, {{fname='Vasya', sname='Pupkin'}}}) +s:insert({7, {1}, {{fname='James', sname='Bond'}}}) +arr_idx:select({1}) +idx:select() +-- Snapshot & recovery. +box.snapshot() +test_run:cmd("restart server default") +s = box.space["withdata"] +idx = s.index["idx"] +arr_idx = s.index["arr_idx"] +s:select() +idx:select() +arr_idx:select() +s:drop() + +-- Assymetric multikey index paths. +test_run = require('test_run').new() +engine = test_run:get_cfg('engine') +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx = s:create_index('idx', {parts = {{3, 'str', path = '[*].fname'}, {3, 'str', path = '[*].extra.sname', is_nullable = true}}}) +s:insert({1, 1, {{fname='A1', extra={sname='A2'}}, {fname='B1'}, {fname='C1', extra={sname='C2'}}}}) +s:drop() + +-- Unique multikey index features. +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx0 = s:create_index('idx0', {parts = {{2, 'int', path = '[*]'}}}) +s:insert({1, {1, 1, 1}}) +s:insert({2, {2, 2}}) +s:insert({3, {3, 3, 2, 2, 1, 1}}) +idx0:get(2) +idx0:get(1) +idx0:get(3) +idx0:select() +idx0:delete(2) +idx0:get(2) +idx0:select() +s:drop() + +-- Test user JSON endpoint doesn't fail in case of multikey index. +t = box.tuple.new{1, {a = 1, b = 2}, 3} +t['[2][*]'] + +-- Test multikey index deduplication on recovery. 
+test_run = require('test_run').new() +engine = test_run:get_cfg('engine') +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx0 = s:create_index('idx0', {parts = {{2, 'int', path = '[*]', is_nullable = true}}}) +s:insert({1, {1, 1, 3, 1}}) +idx0:select() +s:replace({1, {5, 5, 5, 1, 3, 3}}) +idx0:select() +-- Test update. +s:update(1, {{'=', 2, {20, 10, 30, 30}}}) +idx0:select() +box.snapshot() +test_run:cmd("restart server default") +s = box.space.withdata +s:select() +idx0 = s.index.idx0 +idx0:select() +s:insert({2, {2, 4, 2}}) +s:select() +idx0:select() +s:drop() + +-- Test upsert & reverse iterators. +test_run = require('test_run').new() +engine = test_run:get_cfg('engine') +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx0 = s:create_index('idx0', {unique=false, parts = {{2, 'int', path = '[*]'}}}) +s:insert({1, {1, 1, 1, 2, 2}}) +s:insert({2, {3, 3, 4, 4, 4}}) +s:insert({3, {5, 5, 5, 5, 6, 6, 6, 6}}) +idx0:select(5, {iterator = box.index.LT}) +idx0:select(5, {iterator = box.index.GE}) +idx0:select() +s:upsert({3, {5, 5, 5, 5, 6, 6, 6, 6}}, {{'=', 2, {1, 1, 2, 2, 3, 3, 4, 4}}}) +idx0:select() +s:drop() + +-- Test multikey index with epsent data (1 part, is_nullable = false). +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx0 = s:create_index('idx0', {parts = {{2, 'str', path = '[*].name'}}}) +s:insert({0, {{fname='A0'}, {fname='B0'}, {fname='C0'}}}) +s:insert({0, {{fname='A0'}, {fname='B0', name='ZB1'}, {fname='C0'}}}) +s:insert({0, {{fname='A0', name='ZA1'}, {fname='B0', name='ZB1'}, {fname='C0'}}}) +s:insert({0, {{fname='A0', name='ZA1'}, {fname='B0', name='ZB1'}, {fname='C0', name='ZC1'}}}) +s:drop() + +-- Test multikey replacement conflict. +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx0 = s:create_index('idx0', {parts = {{2, 'str', path = '[*].name', is_nullable=true}}}) +s:insert({0, {{fname='A0'}, {fname='B0'}, {fname='C0', name='CONFLICT1'}}}) +s:insert({1, {{fname='A1'}, {fname='B1'}, {fname='C1', name='CONFLICT2'}}}) +s:insert({2, {{fname='A2_1'}, {fname='B2_1', name='ZB2_1'}, {fname='C2_1'}, {name="DUP"}, {name="DUP"}}}) +s:replace({2, {{fname='A2_2'}, {fname='B2_2', name='ZB2_2'}, {fname='C2_2'}, {name="DUP"}, {name="DUP"}}}) +s:replace({2, {{fname='A2_3'}, {fname='B2_3', name='ZB2_3'}, {fname='C2_3'}, {name="DUP"}, {name='CONFLICT1'}, {name='CONFLICT2'}, {name="DUP"}}}) +s:replace({2, {{fname='A2_4'}, {fname='B2_4', name='ZB2_4'}, {fname='C2_4'}, {name="DUP"}, {name='CONFLICT2'}, {name='CONFLICT1'}, {name="DUP"}}}) +idx0:select() +box.snapshot() +test_run:cmd("restart server default") +s = box.space.withdata +idx0 = s.index.idx0 +s:select() +idx0:select() +s:drop() + +-- Test multikey index with epsent data (2 parts, is_nullable = true). +test_run = require('test_run').new() +engine = test_run:get_cfg('engine') +s = box.schema.space.create('withdata', {engine = engine}) +pk = s:create_index('pk') +idx0 = s:create_index('idx0', {unique=false, parts = {{2, 'str', path = '[*].name', is_nullable=true}}}) +-- No idx0 index data at all. +s:insert({0, {{fname='A0'}, {fname='B0'}, {fname='C0'}}}) +idx0:select() +-- Only one field corresponds idx0. 
+s:insert({1, {{fname='A1'}, {fname='B1', name='ZB1'}, {fname='C1'}}}) +s:insert({2, {{fname='A2'}, {fname='B2', name='ZB2'}, {fname='C2'}, {name="DUP"}, {name="DUP"}}}) +idx0:select() +s:replace({1, {{fname='A3'}, {fname='B3', name='ZB3'}, {fname='C3'}}}) +idx0:select() +s:replace({1, {{fname='A4', name='ZA4'}, {fname='B4', name='ZB4'}, {fname='C4', name='ZC4'}, {name="DUP"}, {name="DUP"}, {name="DUP"}}}) +idx0:select() +s:replace({1, {{fname='A5', name='ZA5'}, {fname='B5'}, {fname='C5'}, {name="DUP"}, {name="DUP"}, {name="DUP"}}}) +idx0:select() +s:select() +box.snapshot() +test_run:cmd("restart server default") +s = box.space.withdata +idx0 = s.index.idx0 +s:select() +idx0:select() +-- Test multikey index alter. +idx0:alter({parts = {{2, 'str', path = '[*].fname', is_nullable=true}}}) +idx0:select() +s:drop() + +-- Create unique multikey index on space with tuple which +-- contains the same key multiple times. +test_run = require('test_run').new() +engine = test_run:get_cfg('engine') +s = box.schema.space.create('test', {engine = engine}) +i = s:create_index('pk') +s:replace{1, {{1, 1}}} +s:replace{2, {{2, 3}}} +i2 = s:create_index('sk', {parts = {{2, 'unsigned', path = '[1][*]'}}}) +i2:select() +s:drop() -- 2.21.0
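A note on reading the select() output in the tests above: a non-unique multikey index returns a stored tuple once for every distinct key extracted from it, so selecting over such an index may yield the same tuple several times. Below is a minimal illustrative sketch, not part of the patch, of collapsing such a result set by primary key on the application side; the helper name dedup_by_pk is hypothetical, and the only assumption is a space layout like the ones in the tests, where field 1 is the primary key.

-- Hypothetical helper, not part of the patch: collapse duplicates
-- produced by a multikey index select() by keying on the first
-- (primary key) field of every returned tuple.
local function dedup_by_pk(tuples)
    local seen = {}
    local result = {}
    for _, tuple in ipairs(tuples) do
        local pk = tuple[1]
        if seen[pk] == nil then
            seen[pk] = true
            table.insert(result, tuple)
        end
    end
    return result
end

-- Usage sketch against a space like the ones in the tests above:
-- dedup_by_pk(arr_idx:select({1}))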