From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp48.i.mail.ru (smtp48.i.mail.ru [94.100.177.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 4CD3E46971A for ; Mon, 9 Dec 2019 23:18:43 +0300 (MSK) Date: Mon, 9 Dec 2019 23:18:42 +0300 From: Sergey Ostanevich Message-ID: <20191209201842.GA14366@tarantool.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH 1/1] tuple: enable isolated JSON updates List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy Cc: tarantool-patches@dev.tarantool.org Hi! Thanks for the patch! Please, see 4 comments below. Regards, Sergos > diff --git a/src/box/vinyl.c b/src/box/vinyl.c > index 767e40006..15a136f81 100644 > --- a/src/box/vinyl.c > +++ b/src/box/vinyl.c > @@ -2004,13 +2004,25 @@ request_normalize_ops(struct request *request) > ops_end = mp_encode_str(ops_end, op_name, op_name_len); > > int field_no; > - if (mp_typeof(*pos) == MP_INT) { > + const char *field_name; > + switch (mp_typeof(*pos)) { > + case MP_INT: > field_no = mp_decode_int(&pos); > ops_end = mp_encode_int(ops_end, field_no); > - } else { > + break; > + case MP_UINT: > field_no = mp_decode_uint(&pos); > field_no -= request->index_base; > ops_end = mp_encode_uint(ops_end, field_no); > + break; > + case MP_STR: > + field_name = pos; > + mp_next(&pos); > + memcpy(ops_end, field_name, pos - field_name); > + ops_end += pos - field_name; > + break; Are you intentionally skip encode/decode here? Is it safe enough? > diff --git a/src/box/xrow_update.c b/src/box/xrow_update.c > index 123db081a..ecda17544 100644 > --- a/src/box/xrow_update.c > +++ b/src/box/xrow_update.c > @@ -412,6 +434,15 @@ xrow_upsert_squash(const char *expr1, const char *expr1_end, > if (op->opcode != '+' && op->opcode != '-' && > op->opcode != '=') > return NULL; > + /* > + * Not terminal operation means, that the > + * update is not flat, and squash would > + * need to build a tree of operations to > + * find matches. That is too complex, > + * squash is skipped. > + */ > + if (! xrow_update_op_is_term(op)) > + return NULL; Please, no space after unary op - and further on in the patch - according to $3.1 of https://www.tarantool.io/en/doc/2.2/dev_guide/c_style_guide/ > diff --git a/src/box/xrow_update_bar.c b/src/box/xrow_update_bar.c > new file mode 100644 > index 000000000..737673e8a > --- /dev/null > +++ b/src/box/xrow_update_bar.c > @@ -0,0 +1,454 @@ > +/* > + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file. > + * > + * Redistribution and use in source and binary forms, with or > + * without modification, are permitted provided that the following > + * conditions are met: > + * > + * 1. Redistributions of source code must retain the above > + * copyright notice, this list of conditions and the > + * following disclaimer. > + * > + * 2. Redistributions in binary form must reproduce the above > + * copyright notice, this list of conditions and the following > + * disclaimer in the documentation and/or other materials > + * provided with the distribution. > + * > + * THIS SOFTWARE IS PROVIDED BY ``AS IS'' AND > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED > + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL > + * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, > + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF > + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR > + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF > + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF > + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > + * SUCH DAMAGE. > + */ > +#include "xrow_update_field.h" > +#include "tuple.h" > + > +/** > + * Locate a field to update by @a op's JSON path and initialize > + * @a field as a bar update. > + * > + * @param op Update operation. > + * @param field Field to locate in. > + * @param[out] key_len_or_index One parameter for two values, > + * depending on where the target point is located: in an > + * array or a map. In case of map it is size of a key > + * before the found point. It is used to find range of the > + * both key and value in '#' operation to drop the pair. > + * In case of array it is index of the array element to be > + * able to check how many fields are left for deletion. > + * > + * @retval 0 Success. > + * @retval -1 Not found or invalid JSON. > + */ > +static inline int > +xrow_update_bar_locate(struct xrow_update_op *op, > + struct xrow_update_field *field, > + int *key_len_or_index) > +{ > + /* > + * Bar update is not flat by definition. It always has a > + * non empty path. This is why op is expected to be not > + * terminal. > + */ > + assert(! xrow_update_op_is_term(op)); > + int rc; > + field->type = XUPDATE_BAR; > + field->bar.op = op; > + field->bar.path = op->lexer.src + op->lexer.offset; > + field->bar.path_len = op->lexer.src_len - op->lexer.offset; > + const char *pos = field->data; > + struct json_token token; > + while ((rc = json_lexer_next_token(&op->lexer, &token)) == 0 && > + token.type != JSON_TOKEN_END) { > + > + switch (token.type) { > + case JSON_TOKEN_NUM: > + field->bar.parent = pos; > + *key_len_or_index = token.num; > + rc = tuple_field_go_to_index(&pos, token.num); > + break; > + case JSON_TOKEN_STR: > + field->bar.parent = pos; > + *key_len_or_index = token.len; > + rc = tuple_field_go_to_key(&pos, token.str, token.len); > + break; > + default: > + assert(token.type == JSON_TOKEN_ANY); > + rc = op->lexer.symbol_count - 1; > + return xrow_update_err_bad_json(op, rc); > + } > + if (rc != 0) > + return xrow_update_err_no_such_field(op); > + } > + if (rc > 0) > + return xrow_update_err_bad_json(op, rc); > + > + field->bar.point = pos; > + mp_next(&pos); > + field->bar.point_size = pos - field->bar.point; > + return 0; > +} > + > +/** > + * Locate an optional field to update by @a op's JSON path. If > + * found or only a last path part is not found, initialize @a > + * field as a bar update. Last path part may not exist and it is > + * ok, for example, for '!' and '=' operations. > + */ > +static inline int > +xrow_update_bar_locate_opt(struct xrow_update_op *op, > + struct xrow_update_field *field, bool *is_found, > + int *key_len_or_index) > +{ > + /* > + * Bar update is not flat by definition. It always has a > + * non empty path. This is why op is expected to be not > + * terminal. > + */ > + assert(! xrow_update_op_is_term(op)); > + int rc; > + field->type = XUPDATE_BAR; > + field->bar.op = op; > + field->bar.path = op->lexer.src + op->lexer.offset; > + field->bar.path_len = op->lexer.src_len - op->lexer.offset; > + const char *pos = field->data; > + struct json_token token; > + do { > + rc = json_lexer_next_token(&op->lexer, &token); > + if (rc != 0) > + return xrow_update_err_bad_json(op, rc); > + > + switch (token.type) { > + case JSON_TOKEN_END: > + *is_found = true; > + field->bar.point = pos; > + mp_next(&pos); > + field->bar.point_size = pos - field->bar.point; > + return 0; > + case JSON_TOKEN_NUM: > + field->bar.parent = pos; > + *key_len_or_index = token.num; > + rc = tuple_field_go_to_index(&pos, token.num); > + break; > + case JSON_TOKEN_STR: > + field->bar.parent = pos; > + *key_len_or_index = token.len; > + rc = tuple_field_go_to_key(&pos, token.str, token.len); > + break; > + default: > + assert(token.type == JSON_TOKEN_ANY); > + rc = op->lexer.symbol_count - 1; > + return xrow_update_err_bad_json(op, rc); > + } > + } while (rc == 0); > + assert(rc == -1); > + /* Ensure, that 'token' is next to last path part. */ > + struct json_token tmp_token; > + rc = json_lexer_next_token(&op->lexer, &tmp_token); > + if (rc != 0) > + return xrow_update_err_bad_json(op, rc); > + if (tmp_token.type != JSON_TOKEN_END) > + return xrow_update_err_no_such_field(op); > + > + *is_found = false; > + if (token.type == JSON_TOKEN_NUM) { > + const char *tmp = field->bar.parent; > + if (mp_typeof(*tmp) != MP_ARRAY) { > + return xrow_update_err(op, "can not access by index a "\ > + "non-array field"); > + } > + uint32_t size = mp_decode_array(&tmp); > + if ((uint32_t) token.num > size) > + return xrow_update_err_no_such_field(op); IMHO there will be more informative to mention the array and an index that is out of bounds. > + /* > + * The updated point is in an array, its position > + * was not found, and its index is <= size. The > + * only way how can that happen - the update tries > + * to append a new array element. The following > + * code tries to find the array's end. > + */ > + assert((uint32_t) token.num == size); > + if (field->bar.parent == field->data) { > + /* > + * Optimization for the case when the path > + * is short. So parent of the updated > + * point is the field itself. It allows > + * not to decode anything at all. It is > + * worth doing, since the paths are > + * usually short. > + */ > + field->bar.point = field->data + field->size; > + } else { > + field->bar.point = field->bar.parent; > + mp_next(&field->bar.point); > + } > + field->bar.point_size = 0; > + } else { > + assert(token.type == JSON_TOKEN_STR); > + field->bar.new_key = token.str; > + field->bar.new_key_len = token.len; > + if (mp_typeof(*field->bar.parent) != MP_MAP) { > + return xrow_update_err(op, "can not access by key a "\ > + "non-map field"); > + } > + } > + return 0; > +} > + > +/** > + * Nop fields are those which are not updated. And when they > + * receive an update via one of xrow_update_op_do_nop_* functions, > + * it means, that there is a non terminal path digging inside this > + * not updated field. It turns nop field into a bar field. How > + * exactly - depends on a concrete operation. > + */ > + > +int > +xrow_update_op_do_nop_insert(struct xrow_update_op *op, > + struct xrow_update_field *field) > +{ > + assert(op->opcode == '!'); > + assert(field->type == XUPDATE_NOP); Hmm. Do you expect to handle all possible cases in debug mode? Otherwise here should be some sort of gaceful fail to make prod more stable.