From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id A37CE2574C for ; Sat, 31 Aug 2019 17:32:44 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id thNgrYTTg4DK for ; Sat, 31 Aug 2019 17:32:44 -0400 (EDT) Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 0BA7325BA5 for ; Sat, 31 Aug 2019 17:32:44 -0400 (EDT) From: Vladislav Shpilevoy Subject: [tarantool-patches] [PATCH v2 5/8] tuple: enable JSON bar updates Date: Sat, 31 Aug 2019 23:35:55 +0200 Message-Id: <098ebe97d3b26a24443ce1861405bb5dc00cf7e4.1567287197.git.v.shpilevoy@tarantool.org> In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org Cc: kostja@tarantool.org A bar update is an update by JSON path of a tuple's internal object, with no other updates along the path. It is an optimization that avoids storing each part of the path as a separate object linked in memory. A bar stores just the JSON path string and a pointer to the MessagePack object to update. So far a bar can not have other updates along its path; this patch introduces only non-intersecting JSON path updates (which is actually one of the most common cases, so the bar is a really useful optimization). An example: there is an update {'=', 'a.b.c[1][2][3]', 100}.
For this whole update only one object will be created: { op = '=', path = 'a.b.c[1][2][3]', new_value = 100, ptr_to_tuple_msgpack = 0x...., ... } The elements 'a', 'b', 'c', '[1]', '[2]', '[3]' are not a list of objects each occupying memory. The whole bar is stored as one object, and the path as a pointer to a string in MessagePack obtained from a user. This makes the memory complexity of JSON updates independent of path lengths. Part of #1261 --- src/box/CMakeLists.txt | 1 + src/box/tuple_update.c | 35 ++- src/box/update/update_array.c | 24 +- src/box/update/update_bar.c | 408 +++++++++++++++++++++++++++++++++ src/box/update/update_field.c | 44 +++- src/box/update/update_field.h | 102 +++++++++ src/box/vinyl.c | 17 +- test/box/update.result | 410 +++++++++++++++++++++++++++++++++- test/box/update.test.lua | 145 ++++++++++++ test/engine/update.result | 5 - test/engine/update.test.lua | 2 - test/unit/column_mask.c | 75 ++++++- test/unit/column_mask.result | 8 +- 13 files changed, 1247 insertions(+), 29 deletions(-) create mode 100644 src/box/update/update_bar.c diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt index 62737aafe..2888f4d7d 100644 --- a/src/box/CMakeLists.txt +++ b/src/box/CMakeLists.txt @@ -43,6 +43,7 @@ add_library(tuple STATIC tuple_update.c update/update_field.c update/update_array.c + update/update_bar.c tuple_compare.cc tuple_extract_key.cc tuple_hash.cc diff --git a/src/box/tuple_update.c b/src/box/tuple_update.c index 81e1f7e97..a9e4ed615 100644 --- a/src/box/tuple_update.c +++ b/src/box/tuple_update.c @@ -47,7 +47,12 @@ struct tuple_update { * is from Lua, then the base is 1. Otherwise 0. */ int index_base; - /** A bitmask of all columns modified by this update. */ + /** + * A bitmask of all columns modified by this update. Only + * the first level of a tuple is accounted here. I.e. if + * a field [1][2][3] was updated, then only [1] is + * reflected. + */ uint64_t column_mask; /** First level of update tree. It is always array.
*/ struct update_field root_array; @@ -106,9 +111,25 @@ update_read_ops(struct tuple_update *update, const char *expr, */ if (column_mask != COLUMN_MASK_FULL) { int32_t field_no; + char opcode; + if (update_op_is_term(op)) { + opcode = op->opcode; + } else { + /* + * When a field is not terminal, + * on the first level it is for + * sure changes only one field and + * in terms of column mask is + * equivalent to any scalar + * operation. Even if it was '!' + * or '#'. + */ + opcode = 0; + } + if (op->field_no >= 0) field_no = op->field_no; - else if (op->opcode != '!') + else if (opcode != '!') field_no = field_count_hint + op->field_no; else /* @@ -151,12 +172,12 @@ update_read_ops(struct tuple_update *update, const char *expr, * hint. It is used to translate negative * field numbers into positive ones. */ - if (op->opcode == '!') + if (opcode == '!') ++field_count_hint; - else if (op->opcode == '#') + else if (opcode == '#') field_count_hint -= (int32_t) op->arg.del.count; - if (op->opcode == '!' || op->opcode == '#') + if (opcode == '!' || opcode == '#') /* * If the operation is insertion * or deletion then it potentially @@ -331,8 +352,8 @@ tuple_upsert_squash(const char *expr1, const char *expr1_end, int32_t prev_field_no = index_base - 1; for (uint32_t i = 0; i < update[j].op_count; i++) { struct update_op *op = &update[j].ops[i]; - if (op->opcode != '+' && op->opcode != '-' && - op->opcode != '=') + if ((op->opcode != '+' && op->opcode != '-' && + op->opcode != '=') || op->lexer.src != NULL) return NULL; if (op->field_no <= prev_field_no) return NULL; diff --git a/src/box/update/update_array.c b/src/box/update/update_array.c index 5b834b644..fe50a605a 100644 --- a/src/box/update/update_array.c +++ b/src/box/update/update_array.c @@ -216,12 +216,19 @@ do_op_array_insert(struct update_op *op, struct update_field *field) { assert(field->type == UPDATE_ARRAY); struct rope *rope = field->array.rope; + struct update_array_item *item; + if (! 
update_op_is_term(op)) { + item = update_array_extract_item(field, op); + if (item == NULL) + return -1; + return do_op_insert(op, &item->field); + } + if (update_op_adjust_field_no(op, rope_size(rope) + 1) != 0) return -1; - struct update_array_item *item = - (struct update_array_item *) rope_alloc(rope->ctx, - sizeof(*item)); + item = (struct update_array_item *) rope_alloc(rope->ctx, + sizeof(*item)); if (item == NULL) return -1; update_array_item_create(item, UPDATE_NOP, op->arg.set.value, @@ -242,6 +249,8 @@ do_op_array_set(struct update_op *op, struct update_field *field) update_array_extract_item(field, op); if (item == NULL) return -1; + if (! update_op_is_term(op)) + return do_op_set(op, &item->field); op->new_field_len = op->arg.set.length; /* Ignore the previous op, if any. */ item->field.type = UPDATE_SCALAR; @@ -253,6 +262,13 @@ int do_op_array_delete(struct update_op *op, struct update_field *field) { assert(field->type == UPDATE_ARRAY); + if (! update_op_is_term(op)) { + struct update_array_item *item = + update_array_extract_item(field, op); + if (item == NULL) + return -1; + return do_op_delete(op, &item->field); + } struct rope *rope = field->array.rope; uint32_t size = rope_size(rope); if (update_op_adjust_field_no(op, size) != 0) @@ -273,6 +289,8 @@ do_op_array_##op_type(struct update_op *op, struct update_field *field) \ update_array_extract_item(field, op); \ if (item == NULL) \ return -1; \ + if (! update_op_is_term(op)) \ + return do_op_##op_type(op, &item->field); \ if (item->field.type != UPDATE_NOP) \ return update_err_double(op); \ if (update_op_do_##op_type(op, item->field.data) != 0) \ diff --git a/src/box/update/update_bar.c b/src/box/update/update_bar.c new file mode 100644 index 000000000..f4fc00716 --- /dev/null +++ b/src/box/update/update_bar.c @@ -0,0 +1,408 @@ +/* + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file. 
+ * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL + * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ +#include "update_field.h" +#include "box/tuple.h" + +/** + * Locate the field to update by JSON path in @a op->path. If + * found, initialize @a field as a bar update. + * @param op Update operation. + * @param field Field to locate in. + * + * @retval 0 Success. + * @retval -1 Not found or invalid JSON. + */ +static inline int +update_bar_locate(struct update_op *op, struct update_field *field) +{ + assert(! 
update_op_is_term(op)); + const char *parent = NULL, *pos = field->data; + field->bar.path = op->lexer.src + op->lexer.offset; + field->bar.path_len = op->lexer.src_len - op->lexer.offset; + int rc; + struct json_token token; + while ((rc = json_lexer_next_token(&op->lexer, &token)) == 0 && + token.type != JSON_TOKEN_END) { + + parent = pos; + switch (token.type) { + case JSON_TOKEN_NUM: + rc = tuple_field_go_to_index(&pos, token.num); + break; + case JSON_TOKEN_STR: + rc = tuple_field_go_to_key(&pos, token.str, token.len); + break; + default: + assert(token.type == JSON_TOKEN_ANY); + return update_err_bad_json(op, + op->lexer.symbol_count - 1); + } + if (rc != 0) + return update_err_no_such_field(op); + } + if (rc > 0) + return update_err_bad_json(op, rc); + + field->type = UPDATE_BAR; + field->bar.point = pos; + mp_next(&pos); + field->bar.point_size = pos - field->bar.point; + field->bar.op = op; + field->bar.parent = parent; + return 0; +} + +/** + * Locate the optional field to set by JSON path in @a op->path. + * If found or only a last path part is not found, initialize @a + * field. + * @param op Update operation. + * @param field Field to locate in. + * @param[out] is_found Set if the field was found. + * @param[out] key_len_or_index One parameter for two values, + * depending on where the target point is located: in an + * array or a map. In case of map it is size of a key + * before the found point. It is used to find range of the + * both key and value in '#' operation to drop the pair. + * In case of array it is index of the point to be able to + * check how many fields are left for deletion. + * + * @retval 0 Success. + * @retval -1 Not found non-last path part or invalid JSON. + */ +static inline int +update_bar_locate_opt(struct update_op *op, struct update_field *field, + bool *is_found, int *key_len_or_index) +{ + assert(! 
update_op_is_term(op)); + int rc; + field->type = UPDATE_BAR; + field->bar.op = op; + field->bar.path = op->lexer.src + op->lexer.offset; + field->bar.path_len = op->lexer.src_len - op->lexer.offset; + const char *pos = field->data; + struct json_token token; + do { + rc = json_lexer_next_token(&op->lexer, &token); + if (rc != 0) + return update_err_bad_json(op, rc); + + switch (token.type) { + case JSON_TOKEN_END: + *is_found = true; + field->bar.point = pos; + mp_next(&pos); + field->bar.point_size = pos - field->bar.point; + return 0; + case JSON_TOKEN_NUM: + field->bar.parent = pos; + *key_len_or_index = token.num; + rc = tuple_field_go_to_index(&pos, token.num); + break; + case JSON_TOKEN_STR: + field->bar.parent = pos; + *key_len_or_index = token.len; + rc = tuple_field_go_to_key(&pos, token.str, token.len); + break; + default: + assert(token.type == JSON_TOKEN_ANY); + return update_err_bad_json(op, + op->lexer.symbol_count - 1); + } + } while (rc == 0); + assert(rc == -1); + struct json_token tmp_token; + rc = json_lexer_next_token(&op->lexer, &tmp_token); + if (rc != 0) + return update_err_bad_json(op, rc); + if (tmp_token.type != JSON_TOKEN_END) + return update_err_no_such_field(op); + + *is_found = false; + if (token.type == JSON_TOKEN_NUM) { + if (mp_typeof(*field->bar.parent) != MP_ARRAY) { + return update_err(op, "can not access by index a "\ + "non-array field"); + } + const char *tmp = field->bar.parent; + uint32_t size = mp_decode_array(&tmp); + if ((uint32_t) token.num > size) + return update_err_no_such_field(op); + /* + * The only way not to find an element in an array + * by an index is to use array size as the index. 
+ */ + assert((uint32_t) token.num == size); + if (field->bar.parent == field->data) { + field->bar.point = field->data + field->size; + } else { + field->bar.point = field->bar.parent; + mp_next(&field->bar.point); + } + } else { + assert(token.type == JSON_TOKEN_STR); + field->bar.new_key = token.str; + field->bar.new_key_len = token.len; + if (mp_typeof(*field->bar.parent) != MP_MAP) { + return update_err(op, "can not access by key a "\ + "non-map field"); + } + } + return 0; +} + +int +do_op_nop_insert(struct update_op *op, struct update_field *field) +{ + assert(op->opcode == '!'); + assert(field->type == UPDATE_NOP); + bool is_found = false; + int key_len = 0; + if (update_bar_locate_opt(op, field, &is_found, &key_len) != 0) + return -1; + op->new_field_len = op->arg.set.length; + if (mp_typeof(*field->bar.parent) == MP_MAP) { + if (is_found) + return update_err_duplicate(op); + op->new_field_len += mp_sizeof_str(key_len); + } + return 0; +} + +int +do_op_nop_set(struct update_op *op, struct update_field *field) +{ + assert(op->opcode == '='); + assert(field->type == UPDATE_NOP); + bool is_found = false; + int key_len = 0; + if (update_bar_locate_opt(op, field, &is_found, &key_len) != 0) + return -1; + op->new_field_len = op->arg.set.length; + if (! is_found) { + op->opcode = '!'; + if (mp_typeof(*field->bar.parent) == MP_MAP) + op->new_field_len += mp_sizeof_str(key_len); + } + return 0; +} + +int +do_op_nop_delete(struct update_op *op, struct update_field *field) +{ + assert(op->opcode == '#'); + assert(field->type == UPDATE_NOP); + bool is_found = false; + int key_len_or_index = 0; + if (update_bar_locate_opt(op, field, &is_found, &key_len_or_index) != 0) + return -1; + if (! 
is_found) + return update_err_no_such_field(op); + if (mp_typeof(*field->bar.parent) == MP_ARRAY) { + const char *tmp = field->bar.parent; + uint32_t size = mp_decode_array(&tmp); + if (key_len_or_index + op->arg.del.count > size) + op->arg.del.count = size - key_len_or_index; + const char *end = field->bar.point + field->bar.point_size; + for (uint32_t i = 1; i < op->arg.del.count; ++i) + mp_next(&end); + field->bar.point_size = end - field->bar.point; + } else { + if (op->arg.del.count != 1) + return update_err_delete1(op); + /* Take key size into account to delete it too. */ + uint32_t key_size = mp_sizeof_str(key_len_or_index); + field->bar.point -= key_size; + field->bar.point_size += key_size; + } + return 0; +} + +#define DO_SCALAR_OP_GENERIC(op_type) \ +int \ +do_op_bar_##op_type(struct update_op *op, struct update_field *field) \ +{ \ + (void) op; \ + (void) field; \ + assert(field->type == UPDATE_BAR); \ + diag_set(ClientError, ER_UNSUPPORTED, "update", \ + "intersected JSON paths"); \ + return -1; \ +} + +DO_SCALAR_OP_GENERIC(insert) + +DO_SCALAR_OP_GENERIC(set) + +DO_SCALAR_OP_GENERIC(delete) + +DO_SCALAR_OP_GENERIC(arith) + +DO_SCALAR_OP_GENERIC(bit) + +DO_SCALAR_OP_GENERIC(splice) + +#undef DO_SCALAR_OP_GENERIC + +#define DO_SCALAR_OP_GENERIC(op_type) \ +int \ +do_op_nop_##op_type(struct update_op *op, struct update_field *field) \ +{ \ + assert(field->type == UPDATE_NOP); \ + if (update_bar_locate(op, field) != 0) \ + return -1; \ + return update_op_do_##op_type(op, field->bar.point); \ +} + +DO_SCALAR_OP_GENERIC(arith) + +DO_SCALAR_OP_GENERIC(bit) + +DO_SCALAR_OP_GENERIC(splice) + +uint32_t +update_bar_sizeof(struct update_field *field) +{ + assert(field->type == UPDATE_BAR); + switch(field->bar.op->opcode) { + case '!': { + const char *parent = field->bar.parent; + uint32_t size = field->size + field->bar.op->new_field_len; + if (mp_typeof(*parent) == MP_ARRAY) { + uint32_t array_size = mp_decode_array(&parent); + return size + 
mp_sizeof_array(array_size + 1) - + mp_sizeof_array(array_size); + } else { + uint32_t map_size = mp_decode_map(&parent); + return size + mp_sizeof_map(map_size + 1) - + mp_sizeof_map(map_size); + } + } + case '#': { + const char *parent = field->bar.parent; + uint32_t delete_count = field->bar.op->arg.del.count; + uint32_t size = field->size - field->bar.point_size; + if (mp_typeof(*parent) == MP_ARRAY) { + uint32_t array_size = mp_decode_array(&parent); + assert(array_size >= delete_count); + return size - mp_sizeof_array(array_size) + + mp_sizeof_array(array_size - delete_count); + } else { + uint32_t map_size = mp_decode_map(&parent); + assert(delete_count == 1); + return size - mp_sizeof_map(map_size) + + mp_sizeof_map(map_size - 1); + } + } + default: { + return field->size - field->bar.point_size + + field->bar.op->new_field_len; + } + } +} + +uint32_t +update_bar_store(struct update_field *field, char *out, char *out_end) +{ + assert(field->type == UPDATE_BAR); + (void) out_end; + struct update_op *op = field->bar.op; + char *out_saved = out; + switch(op->opcode) { + case '!': { + const char *pos = field->bar.parent; + uint32_t before_parent = pos - field->data; + /* Before parent. */ + memcpy(out, field->data, before_parent); + out += before_parent; + if (mp_typeof(*pos) == MP_ARRAY) { + /* New array header. */ + uint32_t size = mp_decode_array(&pos); + out = mp_encode_array(out, size + 1); + /* Before insertion point. */ + size = field->bar.point - pos; + memcpy(out, pos, size); + out += size; + pos += size; + } else { + /* New map header. */ + uint32_t size = mp_decode_map(&pos); + out = mp_encode_map(out, size + 1); + /* New key. */ + out = mp_encode_str(out, field->bar.new_key, + field->bar.new_key_len); + } + /* New value. */ + memcpy(out, op->arg.set.value, op->arg.set.length); + out += op->arg.set.length; + /* Old values and field tail. 
*/ + uint32_t after_point = field->data + field->size - pos; + memcpy(out, pos, after_point); + out += after_point; + return out - out_saved; + } + case '#': { + const char *pos = field->bar.parent; + uint32_t size, before_parent = pos - field->data; + memcpy(out, field->data, before_parent); + out += before_parent; + if (mp_typeof(*pos) == MP_ARRAY) { + size = mp_decode_array(&pos); + out = mp_encode_array(out, size - op->arg.del.count); + } else { + size = mp_decode_map(&pos); + out = mp_encode_map(out, size - 1); + } + size = field->bar.point - pos; + memcpy(out, pos, size); + out += size; + pos = field->bar.point + field->bar.point_size; + + size = field->data + field->size - pos; + memcpy(out, pos, size); + return out + size - out_saved; + } + default: { + uint32_t before_point = field->bar.point - field->data; + const char *field_end = field->data + field->size; + const char *point_end = + field->bar.point + field->bar.point_size; + uint32_t after_point = field_end - point_end; + + memcpy(out, field->data, before_point); + out += before_point; + op->meta->store_cb(op, field->bar.point, out); + out += op->new_field_len; + memcpy(out, point_end, after_point); + return out + after_point - out_saved; + } + } +} diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c index b4ede54db..6baad02dd 100644 --- a/src/box/update/update_field.c +++ b/src/box/update/update_field.c @@ -42,7 +42,9 @@ static inline const char * update_op_field_str(const struct update_op *op) { - if (op->field_no >= 0) + if (op->lexer.src != NULL) + return tt_sprintf("'%.*s'", op->lexer.src_len, op->lexer.src); + else if (op->field_no >= 0) return tt_sprintf("%d", op->field_no + TUPLE_INDEX_BASE); else return tt_sprintf("%d", op->field_no); @@ -83,8 +85,12 @@ update_err_splice_bound(const struct update_op *op) int update_err_no_such_field(const struct update_op *op) { - diag_set(ClientError, ER_NO_SUCH_FIELD_NO, op->field_no >= 0 ? 
- TUPLE_INDEX_BASE + op->field_no : op->field_no); + if (op->lexer.src == NULL) { + diag_set(ClientError, ER_NO_SUCH_FIELD_NO, op->field_no + + (op->field_no >= 0 ? TUPLE_INDEX_BASE : 0)); + return -1; + } + diag_set(ClientError, ER_NO_SUCH_FIELD_NAME, update_op_field_str(op)); return -1; } @@ -108,6 +114,8 @@ update_field_sizeof(struct update_field *field) return field->scalar.op->new_field_len; case UPDATE_ARRAY: return update_array_sizeof(field); + case UPDATE_BAR: + return update_bar_sizeof(field); default: unreachable(); } @@ -132,6 +140,8 @@ update_field_store(struct update_field *field, char *out, char *out_end) return size; case UPDATE_ARRAY: return update_array_store(field, out, out_end); + case UPDATE_BAR: + return update_bar_store(field, out, out_end); default: unreachable(); } @@ -616,6 +626,7 @@ update_op_decode(struct update_op *op, int index_base, switch(mp_typeof(**expr)) { case MP_INT: case MP_UINT: { + json_lexer_create(&op->lexer, NULL, 0, 0); if (mp_read_i32(op, expr, &field_no) != 0) return -1; if (field_no - index_base >= 0) { @@ -631,14 +642,35 @@ update_op_decode(struct update_op *op, int index_base, case MP_STR: { const char *path = mp_decode_str(expr, &len); uint32_t field_no, hash = field_name_hash(path, len); + json_lexer_create(&op->lexer, path, len, TUPLE_INDEX_BASE); if (tuple_fieldno_by_name(dict, path, len, hash, &field_no) == 0) { op->field_no = (int32_t) field_no; + op->lexer.offset = len; break; } - diag_set(ClientError, ER_NO_SUCH_FIELD_NAME, - tt_cstr(path, len)); - return -1; + struct json_token token; + int rc = json_lexer_next_token(&op->lexer, &token); + if (rc != 0) + return update_err_bad_json(op, rc); + switch (token.type) { + case JSON_TOKEN_NUM: + op->field_no = token.num; + break; + case JSON_TOKEN_STR: + hash = field_name_hash(token.str, token.len); + if (tuple_fieldno_by_name(dict, token.str, token.len, + hash, &field_no) == 0) { + op->field_no = (int32_t) field_no; + break; + } + FALLTHROUGH; + default: + 
diag_set(ClientError, ER_NO_SUCH_FIELD_NAME, + tt_cstr(path, len)); + return -1; + } + break; + } default: diag_set(ClientError, ER_ILLEGAL_PARAMS, diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h index 8ce3c3e82..d4499eff8 100644 --- a/src/box/update/update_field.h +++ b/src/box/update/update_field.h @@ -33,6 +33,7 @@ #include "trivia/util.h" #include "tt_static.h" #include +#include "json/json.h" #include "bit/int96.h" #include "mp_decimal.h" @@ -180,6 +181,12 @@ struct update_op { uint32_t new_field_len; /** Opcode symbol: = + - / ... */ char opcode; + /** + * Operation target path and its lexer in one. This lexer + * is used when the operation is applied down through the + * update tree. + */ + struct json_lexer lexer; }; /** @@ -196,6 +203,16 @@ int update_op_decode(struct update_op *op, int index_base, struct tuple_dictionary *dict, const char **expr); +/** + * Check if the operation should be applied on the current path + * node. + */ +static inline bool +update_op_is_term(const struct update_op *op) +{ + return json_lexer_is_eof(&op->lexer); +} + /* }}} update_op */ /* {{{ update_field */ @@ -220,6 +237,14 @@ enum update_type { * of individual fields. */ UPDATE_ARRAY, + /** + * A field of this type stores an update that has a + * non-empty JSON path not intersecting with any other + * update. In such an optimized case nothing needs to be + * allocated for path nodes: neither fields, nor ops, nor + * anything else. And this is the most common case. + */ + UPDATE_BAR, }; /** @@ -254,6 +279,49 @@ struct update_field { struct { struct rope *rope; } array; + /** + * Bar update - by a JSON path not intersecting with + * any other update. + */ + struct { + /** Bar update is a single operation. */ + struct update_op *op; + /** + * Always has a non-empty head path + * leading inside this field's data. + */ + const char *path; + int path_len; + /** + * For insertion/deletion to change parent + * header.
+ */ + const char *parent; + union { + /** + * For scalar op; insertion into + * array; deletion. This is the + * point to delete, change or + * insert after. + */ + struct { + const char *point; + uint32_t point_size; + }; + /* + * For insertion into map. New + * key. On insertion into a map + * there is no strict order as in + * array and no point. The field + * is inserted just right after + * the parent header. + */ + struct { + const char *new_key; + uint32_t new_key_len; + }; + }; + } bar; }; }; @@ -323,6 +391,18 @@ OP_DECL_GENERIC(array) /* }}} update_field.array */ +/* {{{ update_field.bar */ + +OP_DECL_GENERIC(bar) + +/* }}} update_field.bar */ + +/* {{{ update_field.nop */ + +OP_DECL_GENERIC(nop) + +/* }}} update_field.nop */ + #undef OP_DECL_GENERIC /* {{{ Common helpers. */ @@ -342,6 +422,10 @@ do_op_##op_type(struct update_op *op, struct update_field *field) \ switch (field->type) { \ case UPDATE_ARRAY: \ return do_op_array_##op_type(op, field); \ + case UPDATE_NOP: \ + return do_op_nop_##op_type(op, field); \ + case UPDATE_BAR: \ + return do_op_bar_##op_type(op, field); \ default: \ unreachable(); \ } \ @@ -407,6 +491,24 @@ update_err_double(const struct update_op *op) return update_err(op, "double update of the same field"); } +static inline int +update_err_bad_json(const struct update_op *op, int pos) +{ + return update_err(op, tt_sprintf("invalid JSON in position %d", pos)); +} + +static inline int +update_err_delete1(const struct update_op *op) +{ + return update_err(op, "can delete only 1 field from a map in a row"); +} + +static inline int +update_err_duplicate(const struct update_op *op) +{ + return update_err(op, "the key exists already"); +} + /** }}} Error helpers. 
*/ #endif /* TARANTOOL_BOX_TUPLE_UPDATE_FIELD_H */ diff --git a/src/box/vinyl.c b/src/box/vinyl.c index 7455c2c86..7fade332b 100644 --- a/src/box/vinyl.c +++ b/src/box/vinyl.c @@ -1997,15 +1997,28 @@ request_normalize_ops(struct request *request) ops_end = mp_encode_str(ops_end, op_name, op_name_len); int field_no; - if (mp_typeof(*pos) == MP_INT) { + const char *field_name; + switch (mp_typeof(*pos)) { + case MP_INT: field_no = mp_decode_int(&pos); ops_end = mp_encode_int(ops_end, field_no); - } else { + break; + case MP_UINT: field_no = mp_decode_uint(&pos); field_no -= request->index_base; ops_end = mp_encode_uint(ops_end, field_no); + break; + case MP_STR: + field_name = pos; + mp_next(&pos); + memcpy(ops_end, field_name, pos - field_name); + ops_end += pos - field_name; + break; + default: + unreachable(); } + if (*op_name == ':') { /** * splice op adjust string pos and copy diff --git a/test/box/update.result b/test/box/update.result index 6c7bf09df..38b8bc13e 100644 --- a/test/box/update.result +++ b/test/box/update.result @@ -834,7 +834,7 @@ s:update({0}, {{'+', 0}}) ... s:update({0}, {{'+', '+', '+'}}) --- -- error: Field '+' was not found in the tuple +- error: 'Field ''+'' UPDATE error: invalid JSON in position 1' ... s:update({0}, {{0, 0, 0}}) --- @@ -889,3 +889,411 @@ s:update(1, {{'=', 3, map}}) s:drop() --- ... +-- +-- gh-1261: update by JSON path. +-- +format = {} +--- +... +format[1] = {'field1', 'unsigned'} +--- +... +format[2] = {'f', 'map'} +--- +... +format[3] = {'g', 'array'} +--- +... +s = box.schema.create_space('test', {format = format}) +--- +... +pk = s:create_index('pk') +--- +... +t = {} +--- +... +t[1] = 1 +--- +... +t[2] = { \ + a = 100, \ + b = 200, \ + c = { \ + d = 400, \ + e = 500, \ + f = {4, 5, 6, 7, 8}, \ + g = {k = 600, l = 700} \ + }, \ + m = true, \ + g = {800, 900} \ +}; \ +t[3] = { \ + 100, \ + 200, \ + { \ + {300, 350}, \ + {400, 450} \ + }, \ + {a = 500, b = 600}, \ + {c = 700, d = 800} \ +} +--- +... 
+t = s:insert(t) +--- +... +t4_array = t:update({{'!', 4, setmetatable({}, {__serialize = 'array'})}}) +--- +... +t4_map = t:update({{'!', 4, setmetatable({}, {__serialize = 'map'})}}) +--- +... +t +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +-- +-- At first, test simple non-intersected paths. +-- +-- +-- ! +-- +t:update({{'!', 'f.c.f[1]', 3}, {'!', '[3][1]', {100, 200, 300}}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [3, 4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [[100, 200, 300], 100, 200, [ + [300, 350], [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'!', 'f.g[3]', 1000}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900, 1000]}, [100, 200, [[300, 350], + [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'!', 'g[6]', 'new element'}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}, 'new element']] +... +t:update({{'!', 'f.e', 300}, {'!', 'g[4].c', 700}}) +--- +- [1, {'b': 200, 'm': true, 'g': [800, 900], 'a': 100, 'c': {'d': 400, 'f': [4, 5, + 6, 7, 8], 'e': 500, 'g': {'k': 600, 'l': 700}}, 'e': 300}, [100, 200, [[300, + 350], [400, 450]], {'b': 600, 'c': 700, 'a': 500}, {'c': 700, 'd': 800}]] +... +t:update({{'!', 'f.c.f[2]', 4.5}, {'!', 'g[3][2][2]', 425}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 4.5, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 425, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... 
+t2 = t:update({{'!', 'g[6]', {100}}}) +--- +... +-- Test single element array update. +t2:update({{'!', 'g[6][2]', 200}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}, [100, 200]]] +... +t2:update({{'!', 'g[6][1]', 50}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}, [50, 100]]] +... +-- Test empty array/map. +t4_array:update({{'!', '[4][1]', 100}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], [100]] +... +t4_map:update({{'!', '[4].a', 100}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], {'a': 100}] +... +-- Test errors. +t:update({{'!', 'a', 100}}) -- No such field. +--- +- error: Field 'a' was not found in the tuple +... +t:update({{'!', 'f.a', 300}}) -- Key already exists. +--- +- error: 'Field ''f.a'' UPDATE error: the key exists already' +... +t:update({{'!', 'f.c.f[0]', 3.5}}) -- No such index, too small. +--- +- error: 'Field ''f.c.f[0]'' UPDATE error: invalid JSON in position 7' +... +t:update({{'!', 'f.c.f[100]', 100}}) -- No such index, too big. +--- +- error: Field ''f.c.f[100]'' was not found in the tuple +... +t:update({{'!', 'g[4][100]', 700}}) -- Insert index into map. +--- +- error: 'Field ''g[4][100]'' UPDATE error: can not access by index a non-array field' +... 
+t:update({{'!', 'g[1][1]', 300}}) +--- +- error: 'Field ''g[1][1]'' UPDATE error: can not access by index a non-array field' +... +t:update({{'!', 'f.g.a', 700}}) -- Insert key into array. +--- +- error: 'Field ''f.g.a'' UPDATE error: can not access by key a non-map field' +... +t:update({{'!', 'f.g[1].a', 700}}) +--- +- error: 'Field ''f.g[1].a'' UPDATE error: can not access by key a non-map field' +... +t:update({{'!', 'f[*].k', 20}}) -- 'Any' is not considered valid JSON. +--- +- error: 'Field ''f[*].k'' UPDATE error: invalid JSON in position 3' +... +-- JSON error after the not existing field to insert. +t:update({{'!', '[2].e.100000', 100}}) +--- +- error: 'Field ''[2].e.100000'' UPDATE error: invalid JSON in position 7' +... +-- Correct JSON, but next to last field does not exist. '!' can't +-- create the whole path. +t:update({{'!', '[2].e.f', 100}}) +--- +- error: Field ''[2].e.f'' was not found in the tuple +... +-- +-- = +-- +-- Set existing fields. +t:update({{'=', 'f.a', 150}, {'=', 'g[3][1][2]', 400}}) +--- +- [1, {'b': 200, 'm': true, 'a': 150, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 400], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'=', 'f', {a = 100, b = 200}}}) +--- +- [1, {'a': 100, 'b': 200}, [100, 200, [[300, 350], [400, 450]], {'a': 500, 'b': 600}, + {'c': 700, 'd': 800}]] +... +t:update({{'=', 'g[4].b', 700}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 700}, {'c': 700, 'd': 800}]] +... +-- Insert via set. +t:update({{'=', 'f.e', 300}}) +--- +- [1, {'b': 200, 'm': true, 'g': [800, 900], 'a': 100, 'c': {'d': 400, 'f': [4, 5, + 6, 7, 8], 'e': 500, 'g': {'k': 600, 'l': 700}}, 'e': 300}, [100, 200, [[300, + 350], [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... 
+t:update({{'=', 'f.g[3]', 1000}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900, 1000]}, [100, 200, [[300, 350], + [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'=', 'f.g[1]', 0}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [0, 900]}, [100, 200, [[300, 350], [400, 450]], + {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +-- Test empty array/map. +t4_array:update({{'=', '[4][1]', 100}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], [100]] +... +t4_map:update({{'=', '[4]["a"]', 100}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], {'a': 100}] +... +-- Test errors. +t:update({{'=', 'f.a[1]', 100}}) +--- +- error: 'Field ''f.a[1]'' UPDATE error: can not access by index a non-array field' +... +t:update({{'=', 'f.a.k', 100}}) +--- +- error: 'Field ''f.a.k'' UPDATE error: can not access by key a non-map field' +... +t:update({{'=', 'f.c.f[1]', 100}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [100, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'=', 'f.c.f[100]', 100}}) +--- +- error: Field ''f.c.f[100]'' was not found in the tuple +... +t:update({{'=', '[2].c.f 1 1 1 1', 100}}) +--- +- error: 'Field ''[2].c.f 1 1 1 1'' UPDATE error: invalid JSON in position 8' +... 
+-- +-- # +-- +t:update({{'#', '[2].b', 1}}) +--- +- [1, {'a': 100, 'm': true, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, 'g': { + 'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, 450]], + {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'#', 'f.c.f[1]', 1}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'#', 'f.c.f[1]', 2}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [6, 7, 8], 'e': 500, 'g': { + 'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, 450]], + {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'#', 'f.c.f[1]', 100}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [], 'e': 500, 'g': {'k': 600, + 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, 450]], {'a': 500, + 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'#', 'f.c.f[5]', 1}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'#', 'f.c.f[5]', 2}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +-- Test errors. +t:update({{'#', 'f.h', 1}}) +--- +- error: Field ''f.h'' was not found in the tuple +... +t:update({{'#', 'f.c.f[100]', 1}}) +--- +- error: Field ''f.c.f[100]'' was not found in the tuple +... +t:update({{'#', 'f.b', 2}}) +--- +- error: 'Field ''f.b'' UPDATE error: can delete only 1 field from a map in a row' +... +t:update({{'#', 'f.b', 0}}) +--- +- error: 'Field ''f.b'' UPDATE error: cannot delete 0 fields' +... 
+t:update({{'#', 'f', 0}}) +--- +- error: 'Field ''f'' UPDATE error: cannot delete 0 fields' +... +-- +-- Scalar operations. +-- +t:update({{'+', 'f.a', 50}}) +--- +- [1, {'b': 200, 'm': true, 'a': 150, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'-', 'f.c.f[1]', 0.5}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [3.5, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t:update({{'&', 'f.c.f[2]', 4}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 4, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +t2 = t:update({{'=', 4, {str = 'abcd'}}}) +--- +... +t2:update({{':', '[4].str', 2, 2, 'e'}}) +--- +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], {'str': 'aed'}] +... +-- Test errors. +t:update({{'+', 'g[3]', 50}}) +--- +- error: 'Argument type in operation ''+'' on field ''g[3]'' does not match field + type: expected a number' +... +t:update({{'+', '[2].b.......', 100}}) +--- +- error: 'Field ''[2].b.......'' UPDATE error: invalid JSON in position 7' +... +t:update({{'+', '[2].b.c.d.e', 100}}) +--- +- error: Field ''[2].b.c.d.e'' was not found in the tuple +... +t:update({{'-', '[2][*]', 20}}) +--- +- error: 'Field ''[2][*]'' UPDATE error: invalid JSON in position 5' +... +-- Vinyl normalizes field numbers. It should not touch paths, +-- and they should not affect squashing. +format = {} +--- +... +format[1] = {'field1', 'unsigned'} +--- +... +format[2] = {'field2', 'any'} +--- +... 
+vy_s = box.schema.create_space('test2', {engine = 'vinyl', format = format}) +--- +... +pk = vy_s:create_index('pk') +--- +... +_ = vy_s:replace(t) +--- +... +box.begin() +--- +... +-- Use a scalar operation, only they can be squashed. +vy_s:upsert({1, 1}, {{'+', 'field2.c.f[1]', 1}}) +--- +... +vy_s:upsert({1, 1}, {{'+', '[3][3][1][1]', 1}}) +--- +... +box.commit() +--- +... +vy_s:select() +--- +- - [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [5, 5, 6, 7, 8], 'e': 500, + 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[301, 350], [400, + 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]] +... +vy_s:drop() +--- +... +s:drop() +--- +... diff --git a/test/box/update.test.lua b/test/box/update.test.lua index ac7698ce9..60e669d27 100644 --- a/test/box/update.test.lua +++ b/test/box/update.test.lua @@ -280,3 +280,148 @@ t:update({{'=', 3, map}}) s:update(1, {{'=', 3, map}}) s:drop() + +-- +-- gh-1261: update by JSON path. +-- +format = {} +format[1] = {'field1', 'unsigned'} +format[2] = {'f', 'map'} +format[3] = {'g', 'array'} +s = box.schema.create_space('test', {format = format}) +pk = s:create_index('pk') +t = {} +t[1] = 1 +t[2] = { \ + a = 100, \ + b = 200, \ + c = { \ + d = 400, \ + e = 500, \ + f = {4, 5, 6, 7, 8}, \ + g = {k = 600, l = 700} \ + }, \ + m = true, \ + g = {800, 900} \ +}; \ +t[3] = { \ + 100, \ + 200, \ + { \ + {300, 350}, \ + {400, 450} \ + }, \ + {a = 500, b = 600}, \ + {c = 700, d = 800} \ +} +t = s:insert(t) + +t4_array = t:update({{'!', 4, setmetatable({}, {__serialize = 'array'})}}) +t4_map = t:update({{'!', 4, setmetatable({}, {__serialize = 'map'})}}) + +t +-- +-- At first, test simple non-intersected paths. +-- + +-- +-- ! 
+-- +t:update({{'!', 'f.c.f[1]', 3}, {'!', '[3][1]', {100, 200, 300}}}) +t:update({{'!', 'f.g[3]', 1000}}) +t:update({{'!', 'g[6]', 'new element'}}) +t:update({{'!', 'f.e', 300}, {'!', 'g[4].c', 700}}) +t:update({{'!', 'f.c.f[2]', 4.5}, {'!', 'g[3][2][2]', 425}}) +t2 = t:update({{'!', 'g[6]', {100}}}) +-- Test single element array update. +t2:update({{'!', 'g[6][2]', 200}}) +t2:update({{'!', 'g[6][1]', 50}}) +-- Test empty array/map. +t4_array:update({{'!', '[4][1]', 100}}) +t4_map:update({{'!', '[4].a', 100}}) +-- Test errors. +t:update({{'!', 'a', 100}}) -- No such field. +t:update({{'!', 'f.a', 300}}) -- Key already exists. +t:update({{'!', 'f.c.f[0]', 3.5}}) -- No such index, too small. +t:update({{'!', 'f.c.f[100]', 100}}) -- No such index, too big. +t:update({{'!', 'g[4][100]', 700}}) -- Insert index into map. +t:update({{'!', 'g[1][1]', 300}}) +t:update({{'!', 'f.g.a', 700}}) -- Insert key into array. +t:update({{'!', 'f.g[1].a', 700}}) +t:update({{'!', 'f[*].k', 20}}) -- 'Any' is not considered valid JSON. +-- JSON error after the not existing field to insert. +t:update({{'!', '[2].e.100000', 100}}) +-- Correct JSON, but next to last field does not exist. '!' can't +-- create the whole path. +t:update({{'!', '[2].e.f', 100}}) + +-- +-- = +-- +-- Set existing fields. +t:update({{'=', 'f.a', 150}, {'=', 'g[3][1][2]', 400}}) +t:update({{'=', 'f', {a = 100, b = 200}}}) +t:update({{'=', 'g[4].b', 700}}) +-- Insert via set. +t:update({{'=', 'f.e', 300}}) +t:update({{'=', 'f.g[3]', 1000}}) +t:update({{'=', 'f.g[1]', 0}}) +-- Test empty array/map. +t4_array:update({{'=', '[4][1]', 100}}) +t4_map:update({{'=', '[4]["a"]', 100}}) +-- Test errors. 
+t:update({{'=', 'f.a[1]', 100}}) +t:update({{'=', 'f.a.k', 100}}) +t:update({{'=', 'f.c.f[1]', 100}}) +t:update({{'=', 'f.c.f[100]', 100}}) +t:update({{'=', '[2].c.f 1 1 1 1', 100}}) + +-- +-- # +-- +t:update({{'#', '[2].b', 1}}) +t:update({{'#', 'f.c.f[1]', 1}}) +t:update({{'#', 'f.c.f[1]', 2}}) +t:update({{'#', 'f.c.f[1]', 100}}) +t:update({{'#', 'f.c.f[5]', 1}}) +t:update({{'#', 'f.c.f[5]', 2}}) +-- Test errors. +t:update({{'#', 'f.h', 1}}) +t:update({{'#', 'f.c.f[100]', 1}}) +t:update({{'#', 'f.b', 2}}) +t:update({{'#', 'f.b', 0}}) +t:update({{'#', 'f', 0}}) + +-- +-- Scalar operations. +-- +t:update({{'+', 'f.a', 50}}) +t:update({{'-', 'f.c.f[1]', 0.5}}) +t:update({{'&', 'f.c.f[2]', 4}}) +t2 = t:update({{'=', 4, {str = 'abcd'}}}) +t2:update({{':', '[4].str', 2, 2, 'e'}}) +-- Test errors. +t:update({{'+', 'g[3]', 50}}) +t:update({{'+', '[2].b.......', 100}}) +t:update({{'+', '[2].b.c.d.e', 100}}) +t:update({{'-', '[2][*]', 20}}) + +-- Vinyl normalizes field numbers. It should not touch paths, +-- and they should not affect squashing. +format = {} +format[1] = {'field1', 'unsigned'} +format[2] = {'field2', 'any'} +vy_s = box.schema.create_space('test2', {engine = 'vinyl', format = format}) +pk = vy_s:create_index('pk') +_ = vy_s:replace(t) + +box.begin() +-- Use a scalar operation, only they can be squashed. +vy_s:upsert({1, 1}, {{'+', 'field2.c.f[1]', 1}}) +vy_s:upsert({1, 1}, {{'+', '[3][3][1][1]', 1}}) +box.commit() + +vy_s:select() +vy_s:drop() + +s:drop() diff --git a/test/engine/update.result b/test/engine/update.result index f181924f3..ddb13bd5b 100644 --- a/test/engine/update.result +++ b/test/engine/update.result @@ -843,11 +843,6 @@ t:update({{'+', '[1]', 50}}) --- - [1, [10, 11, 12], {'b': 21, 'a': 20, 'c': 22}, 'abcdefgh', true, -100, 250] ... --- JSON paths are not allowed yet. -t:update({{'=', 'field2[1]', 13}}) ---- -- error: Field 'field2[1]' was not found in the tuple -... 
s:update({1}, {{'=', 'field3', {d = 30, e = 31, f = 32}}}) --- - [1, [10, 11, 12], {'d': 30, 'f': 32, 'e': 31}, 'abcdefgh', true, -100, 200] ... diff --git a/test/engine/update.test.lua b/test/engine/update.test.lua index 4ca2589e4..31fca2b7b 100644 --- a/test/engine/update.test.lua +++ b/test/engine/update.test.lua @@ -156,8 +156,6 @@ t:update({{':', 'field4', 3, 3, 'bbccdd'}, {'+', 'field6', 50}, {'!', 7, 300}}) -- Any path is interpreted as a field name first. And only then -- as JSON. t:update({{'+', '[1]', 50}}) --- JSON paths are not allowed yet. -t:update({{'=', 'field2[1]', 13}}) s:update({1}, {{'=', 'field3', {d = 30, e = 31, f = 32}}}) diff --git a/test/unit/column_mask.c b/test/unit/column_mask.c index 5ee8b7332..38ec34f8f 100644 --- a/test/unit/column_mask.c +++ b/test/unit/column_mask.c @@ -225,16 +225,87 @@ basic_test() column_masks[i]); } +static void +test_paths(void) +{ + header(); + plan(2); + + char buffer1[1024]; + char *pos1 = mp_encode_array(buffer1, 7); + + pos1 = mp_encode_uint(pos1, 1); + pos1 = mp_encode_uint(pos1, 2); + pos1 = mp_encode_array(pos1, 2); + pos1 = mp_encode_uint(pos1, 3); + pos1 = mp_encode_uint(pos1, 4); + pos1 = mp_encode_uint(pos1, 5); + pos1 = mp_encode_array(pos1, 2); + pos1 = mp_encode_uint(pos1, 6); + pos1 = mp_encode_uint(pos1, 7); + pos1 = mp_encode_uint(pos1, 8); + pos1 = mp_encode_uint(pos1, 9); + + char buffer2[1024]; + char *pos2 = mp_encode_array(buffer2, 2); + + pos2 = mp_encode_array(pos2, 3); + pos2 = mp_encode_str(pos2, "!", 1); + pos2 = mp_encode_str(pos2, "[3][1]", 6); + pos2 = mp_encode_double(pos2, 2.5); + + pos2 = mp_encode_array(pos2, 3); + pos2 = mp_encode_str(pos2, "#", 1); + pos2 = mp_encode_str(pos2, "[5][1]", 6); + pos2 = mp_encode_uint(pos2, 1); + + struct region *gc = &fiber()->gc; + size_t svp = region_used(gc); + uint32_t result_size; + uint64_t column_mask; + const char *result = + tuple_update_execute(buffer2, pos2, buffer1, pos1, + box_tuple_format_default()->dict, + &result_size, 1,
&column_mask); + isnt(result, NULL, "JSON update works"); + + /* + * On their first level these updates change fields [3] + * and [5], i.e. 2 and 4 if 0-based. If the operations + * were applied to the root, '!' and '#' would shift all + * the subsequent fields, and the mask would cover every + * field starting from 2. But neither operation targets + * the root, so nothing is affected except [3] and [5] on + * the first level. + */ + uint64_t expected_mask = 0; + column_mask_set_fieldno(&expected_mask, 2); + column_mask_set_fieldno(&expected_mask, 4); + is(column_mask, expected_mask, "column mask match"); + + region_truncate(gc, svp); + + check_plan(); + footer(); +} + +static uint32_t +simple_hash(const char *str, uint32_t len) +{ + return str[0] + len; +} + int main() { memory_init(); fiber_init(fiber_c_invoke); - tuple_init(NULL); + tuple_init(simple_hash); header(); - plan(27); + plan(28); basic_test(); + test_paths(); footer(); check_plan(); diff --git a/test/unit/column_mask.result b/test/unit/column_mask.result index 9309e6cdc..1d87a2f24 100644 --- a/test/unit/column_mask.result +++ b/test/unit/column_mask.result @@ -1,5 +1,5 @@ *** main *** -1..27 +1..28 ok 1 - check result length ok 2 - tuple update is correct ok 3 - column_mask is correct @@ -27,4 +27,10 @@ ok 24 - column_mask is correct ok 25 - check result length ok 26 - tuple update is correct ok 27 - column_mask is correct + *** test_paths *** + 1..2 + ok 1 - JSON update works + ok 2 - column mask match +ok 28 - subtests + *** test_paths: done *** *** main: done *** -- 2.20.1 (Apple Git-117)