* [tarantool-patches] [PATCH v2 1/8] tuple: expose JSON go_to_key and go_to_index functions
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 2/8] tuple: rework updates to improve code extendibility Vladislav Shpilevoy
` (6 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
They are needed by JSON path update for the case when the final
part of the path may not exist and go_to_path returns an error.
That happens with '=' and '!' on fields that do not exist yet.
For example, in {'=', '[1][2][3]', 20} the field [3] may be
absent. For these cases JSON update will have its own
implementation of go_to_path that allows an optional last field.
Needed for #1261
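To illustrate the intended use, here is a minimal sketch of how
the future JSON update code could walk a path whose last component
is allowed to be missing. The helper, the concrete path and the
NULL convention are made up for this example; only the two exposed
functions and their signatures come from the patch:

    /*
     * Hypothetical helper: walk MessagePack(field)[2]["key"] and
     * tolerate the absence of the last component only.
     */
    static int
    walk_path_optional_last(const char **field)
    {
            /* Intermediate components must exist. */
            if (tuple_field_go_to_index(field, 2) != 0)
                    return -1;
            /* The last one may be missing: '=' and '!' can create it. */
            if (tuple_field_go_to_key(field, "key", 3) != 0)
                    *field = NULL;
            return 0;
    }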
---
src/box/tuple.c | 21 ++-------------------
src/box/tuple.h | 23 +++++++++++++++++++++++
2 files changed, 25 insertions(+), 19 deletions(-)
diff --git a/src/box/tuple.c b/src/box/tuple.c
index bf4ea711d..261505f9a 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -385,15 +385,7 @@ tuple_free(void)
/* {{{ tuple_field_* getters */
-/**
- * Propagate @a field to MessagePack(field)[index].
- * @param[in][out] field Field to propagate.
- * @param index 0-based index to propagate to.
- *
- * @retval 0 Success, the index was found.
- * @retval -1 Not found.
- */
-static inline int
+int
tuple_field_go_to_index(const char **field, uint64_t index)
{
enum mp_type type = mp_typeof(**field);
@@ -428,16 +420,7 @@ tuple_field_go_to_index(const char **field, uint64_t index)
return -1;
}
-/**
- * Propagate @a field to MessagePack(field)[key].
- * @param[in][out] field Field to propagate.
- * @param key Key to propagate to.
- * @param len Length of @a key.
- *
- * @retval 0 Success, the index was found.
- * @retval -1 Not found.
- */
-static inline int
+int
tuple_field_go_to_key(const char **field, const char *key, int len)
{
enum mp_type type = mp_typeof(**field);
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 4c4050ca8..ab0c7a99b 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -551,6 +551,29 @@ int
tuple_go_to_path(const char **data, const char *path, uint32_t path_len,
int multikey_idx);
+/**
+ * Propagate @a field to MessagePack(field)[index].
+ * @param[in][out] field Field to propagate.
+ * @param index 0-based index to propagate to.
+ *
+ * @retval 0 Success, the index was found.
+ * @retval -1 Not found.
+ */
+int
+tuple_field_go_to_index(const char **field, uint64_t index);
+
+/**
+ * Propagate @a field to MessagePack(field)[key].
+ * @param[in][out] field Field to propagate.
+ * @param key Key to propagate to.
+ * @param len Length of @a key.
+ *
+ * @retval 0 Success, the key was found.
+ * @retval -1 Not found.
+ */
+int
+tuple_field_go_to_key(const char **field, const char *key, int len);
+
/**
* Get tuple field by field index, relative JSON path and
* multikey_idx.
--
2.20.1 (Apple Git-117)
* [tarantool-patches] [PATCH v2 2/8] tuple: rework updates to improve code extendibility
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 1/8] tuple: expose JSON go_to_key and go_to_index functions Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
[not found] ` <20190903192059.GE15611@atlas>
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 3/8] json: lexer_eof and token_cmp helper functions Vladislav Shpilevoy
` (5 subsequent siblings)
7 siblings, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
Before the patch, update was implemented as a set of operations
applicable to arrays only. That was fine until field names and
JSON paths appeared, because a tuple is an array at the top level.
But now there are four reasons to allow more complex updates of
tuple field internals via JSON paths:
* tuple field access by JSON path is already allowed, so for
consistency JSON paths should be allowed in updates as well;
* JSON indexes are supported. A JSON update should be able to
change an indexed field without rewriting half of the tuple;
* Tarantool is going to support documents in storage, so JSON
path updates are one more step in that direction;
* JSON updates are going to be faster than get + in-memory
Lua/connector update + replace (or an update of a whole tuple
field).
The patch prepares the current update code to be extended with
updates of maps, so-called 'bar' updates, and multilevel updates.
The concept is to build a tree of update objects. Each object
updates either a scalar terminal value (a leaf of that tree) or a
complex object like an array or a map. In the latter case the
object contains >= 2 child updates.
This organization allows splitting the update implementation into
several independent modules, one file per update type. The next
commits will introduce them one by one.
Needed for #1261
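In condensed form the new pipeline looks roughly like the sketch
below. The names are taken from the new files introduced by this
patch; operation decoding, diagnostics and the enclosing functions
(update_do_ops() and update_finish()) are omitted:

    /* Build the update tree for the old tuple, apply ops, serialize. */
    struct update_field root;
    if (update_array_create(&root, old_data, old_data_end, field_count) != 0)
            return -1;
    for (struct update_op *op = ops; op < ops + op_count; ++op) {
            if (op->meta->do_cb(op, &root) != 0)
                    return -1;
    }
    uint32_t new_size = update_array_sizeof(&root);
    char *out = (char *) region_alloc(&fiber()->gc, new_size);
    if (out == NULL)
            return -1;
    uint32_t stored = update_array_store(&root, out, out + new_size);
    assert(stored == new_size);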
---
src/box/CMakeLists.txt | 2 +
src/box/errcode.h | 10 +-
src/box/tuple_update.c | 1125 ++-------------------------------
src/box/update/update_array.c | 288 +++++++++
src/box/update/update_field.c | 649 +++++++++++++++++++
src/box/update/update_field.h | 410 ++++++++++++
test/box/misc.result | 2 +-
test/box/tuple.result | 4 +-
8 files changed, 1402 insertions(+), 1088 deletions(-)
create mode 100644 src/box/update/update_array.c
create mode 100644 src/box/update/update_field.c
create mode 100644 src/box/update/update_field.h
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 9d2fcea4b..62737aafe 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -41,6 +41,8 @@ add_library(tuple STATIC
field_map.c
tuple_format.c
tuple_update.c
+ update/update_field.c
+ update/update_array.c
tuple_compare.cc
tuple_extract_key.cc
tuple_hash.cc
diff --git a/src/box/errcode.h b/src/box/errcode.h
index 0c1310bdf..f8347f6ec 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -77,11 +77,11 @@ struct errcode_record {
/* 22 */_(ER_TUPLE_NOT_ARRAY, "Tuple/Key must be MsgPack array") \
/* 23 */_(ER_FIELD_TYPE, "Tuple field %s type does not match one required by operation: expected %s") \
/* 24 */_(ER_INDEX_PART_TYPE_MISMATCH, "Field %s has type '%s' in one index, but type '%s' in another") \
- /* 25 */_(ER_SPLICE, "SPLICE error on field %u: %s") \
- /* 26 */_(ER_UPDATE_ARG_TYPE, "Argument type in operation '%c' on field %u does not match field type: expected %s") \
+ /* 25 */_(ER_UPDATE_SPLICE, "SPLICE error on field %s: %s") \
+ /* 26 */_(ER_UPDATE_ARG_TYPE, "Argument type in operation '%c' on field %s does not match field type: expected %s") \
/* 27 */_(ER_FORMAT_MISMATCH_INDEX_PART, "Field %s has type '%s' in space format, but type '%s' in index definition") \
/* 28 */_(ER_UNKNOWN_UPDATE_OP, "Unknown UPDATE operation") \
- /* 29 */_(ER_UPDATE_FIELD, "Field %u UPDATE error: %s") \
+ /* 29 */_(ER_UPDATE_FIELD, "Field %s UPDATE error: %s") \
/* 30 */_(ER_FUNCTION_TX_ACTIVE, "Transaction is active at return from function") \
/* 31 */_(ER_KEY_PART_COUNT, "Invalid key part count (expected [0..%u], got %u)") \
/* 32 */_(ER_PROC_LUA, "%s") \
@@ -147,7 +147,7 @@ struct errcode_record {
/* 92 */_(ER_ROLE_NOT_GRANTED, "User '%s' does not have role '%s'") \
/* 93 */_(ER_MISSING_SNAPSHOT, "Can't find snapshot") \
/* 94 */_(ER_CANT_UPDATE_PRIMARY_KEY, "Attempt to modify a tuple field which is part of index '%s' in space '%s'") \
- /* 95 */_(ER_UPDATE_INTEGER_OVERFLOW, "Integer overflow when performing '%c' operation on field %u") \
+ /* 95 */_(ER_UPDATE_INTEGER_OVERFLOW, "Integer overflow when performing '%c' operation on field %s") \
/* 96 */_(ER_GUEST_USER_PASSWORD, "Setting password for guest user has no effect") \
/* 97 */_(ER_TRANSACTION_CONFLICT, "Transaction has been aborted by conflict") \
/* 98 */_(ER_UNSUPPORTED_PRIV, "Unsupported %s privilege '%s'") \
@@ -212,7 +212,7 @@ struct errcode_record {
/*157 */_(ER_SQL_BIND_TYPE, "Bind value type %s for parameter %s is not supported") \
/*158 */_(ER_SQL_BIND_PARAMETER_MAX, "SQL bind parameter limit reached: %d") \
/*159 */_(ER_SQL_EXECUTE, "Failed to execute SQL statement: %s") \
- /*160 */_(ER_UPDATE_DECIMAL_OVERFLOW, "Decimal overflow when performing operation '%c' on field %u") \
+ /*160 */_(ER_UPDATE_DECIMAL_OVERFLOW, "Decimal overflow when performing operation '%c' on field %s") \
/*161 */_(ER_SQL_BIND_NOT_FOUND, "Parameter %s was not found in the statement") \
/*162 */_(ER_ACTION_MISMATCH, "Field %s contains %s on conflict action, but %s in index parts") \
/*163 */_(ER_VIEW_MISSING_SQL, "Space declared as a view must have SQL statement") \
diff --git a/src/box/tuple_update.c b/src/box/tuple_update.c
index dd1d78997..87946919d 100644
--- a/src/box/tuple_update.c
+++ b/src/box/tuple_update.c
@@ -28,1060 +28,31 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
#include "tuple_update.h"
-#include <stdbool.h>
-
-#include "say.h"
#include "error.h"
#include "diag.h"
-#include "trivia/util.h"
-#include "third_party/queue.h"
#include <msgpuck/msgpuck.h>
-#include <bit/int96.h>
#include "column_mask.h"
-#include "mp_extension_types.h"
-#include "mp_decimal.h"
#include "fiber.h"
-#include "tuple_dictionary.h"
-#include "tt_static.h"
-
-/** UPDATE request implementation.
- * UPDATE request is represented by a sequence of operations, each
- * working with a single field. There also are operations which
- * add or remove fields. Only one operation on the same field
- * is allowed.
- *
- * Supported field change operations are: SET, ADD, SUBTRACT;
- * bitwise AND, XOR and OR; SPLICE.
- *
- * Supported tuple change operations are: SET, DELETE, INSERT,
- * PUSH and POP.
- * If the number of fields in a tuple is altered by an operation,
- * field index of all following operations is evaluated against the
- * new tuple.
- *
- * Despite the allowed complexity, a typical use case for UPDATE
- * is when the operation count is much less than field count in
- * a tuple.
- *
- * With the common case in mind, UPDATE tries to minimize
- * the amount of unnecessary temporary tuple copies.
- *
- * First, operations are parsed and initialized. Then, the
- * resulting tuple length is calculated. A new tuple is allocated.
- * Finally, operations are applied sequentially, each copying data
- * from the old tuple to the new tuple.
- *
- * With this approach, cost of UPDATE is proportional to O(tuple
- * length) + O(C * log C), where C is the number of operations in
- * the request, and data is copied from the old tuple to the new
- * one only once.
- *
- * As long as INSERT, DELETE, PUSH and POP change the relative
- * field order, an auxiliary data structure is necessary to look
- * up fields in the "old" tuple by field number. Such field
- * index is built on demand, using "rope" data structure.
- *
- * A rope is a binary tree designed to store long strings built
- * from pieces. Each tree node points to a substring of a large
- * string. In our case, each rope node points at a range of
- * fields, initially in the old tuple, and then, as fields are
- * added and deleted by UPDATE, in the "current" tuple.
- * Note, that the tuple itself is not materialized: when
- * operations which affect field count are initialized, the rope
- * is updated to reflect the new field order.
- * In particular, if a field is deleted by an operation,
- * it disappears from the rope and all subsequent operations
- * on this field number instead affect the field following the
- * deleted one.
- */
-
-static inline void *
-update_alloc(struct region *region, size_t size)
-{
- void *ptr = region_aligned_alloc(region, size, alignof(uint64_t));
- if (ptr == NULL)
- diag_set(OutOfMemory, size, "region_aligned_alloc",
- "update internals");
- return ptr;
-}
+#include "update/update_field.h"
-/** Update internal state */
-struct tuple_update
-{
- struct region *region;
- struct rope *rope;
+/** Tuple update operations and arguments. */
+struct tuple_update {
+ /** Operations array. */
struct update_op *ops;
+ /** Length of ops. */
uint32_t op_count;
- int index_base; /* 0 for C and 1 for Lua */
- /** A bitmask of all columns modified by this update */
- uint64_t column_mask;
-};
-
-/** Argument of SET (and INSERT) operation. */
-struct op_set_arg {
- uint32_t length;
- const char *value;
-};
-
-/** Argument of DELETE operation. */
-struct op_del_arg {
- uint32_t count;
-};
-
-/**
- * MsgPack format code of an arithmetic argument or result.
- * MsgPack codes are not used to simplify type calculation.
- */
-enum arith_type {
- AT_DECIMAL = 0, /* MP_EXT + MP_DECIMAL */
- AT_DOUBLE = 1, /* MP_DOUBLE */
- AT_FLOAT = 2, /* MP_FLOAT */
- AT_INT = 3 /* MP_INT/MP_UINT */
-};
-
-/**
- * Argument (left and right) and result of ADD, SUBTRACT.
- *
- * To perform an arithmetic operation, update first loads
- * left and right arguments into corresponding value objects,
- * then performs arithmetics on types of arguments, thus
- * calculating the type of the result, and then
- * performs the requested operation according to the calculated
- * type rules.
- *
- * The rules are as follows:
- * - when one of the argument types is double, the result is
- * double
- * - when one of the argument types is float, the result is
- * float
- * - for integer arguments, the result type code depends on
- * the range in which falls the result of the operation.
- * If the result is in negative range, it's MP_INT, otherwise
- * it's MP_UINT. If the result is out of bounds of (-2^63,
- * 2^64), and exception is raised for overflow.
- */
-struct op_arith_arg {
- enum arith_type type;
- union {
- double dbl;
- float flt;
- struct int96_num int96;
- decimal_t dec;
- };
-};
-
-/** Argument of AND, XOR, OR operations. */
-struct op_bit_arg {
- uint64_t val;
-};
-
-/** Argument of SPLICE. */
-struct op_splice_arg {
- int32_t offset; /** splice position */
- int32_t cut_length; /** cut this many bytes. */
- const char *paste; /** paste what? */
- uint32_t paste_length; /** paste this many bytes. */
-
- /** Offset of the tail in the old field */
- int32_t tail_offset;
- /** Size of the tail. */
- int32_t tail_length;
-};
-
-union update_op_arg {
- struct op_set_arg set;
- struct op_del_arg del;
- struct op_arith_arg arith;
- struct op_bit_arg bit;
- struct op_splice_arg splice;
-};
-
-struct update_field;
-struct update_op;
-
-static struct update_field *
-update_field_split(struct region *region, struct update_field *data,
- size_t size, size_t offset);
-
-#define ROPE_SPLIT_F update_field_split
-#define ROPE_ALLOC_F update_alloc
-#define rope_data_t struct update_field *
-#define rope_ctx_t struct region *
-
-#include "salad/rope.h"
-
-typedef int (*do_op_func)(struct tuple_update *update, struct update_op *op);
-typedef int (*read_arg_func)(int index_base, struct update_op *op,
- const char **expr);
-typedef void (*store_op_func)(union update_op_arg *arg, const char *in,
- char *out);
-
-/** A set of functions and properties to initialize and do an op. */
-struct update_op_meta {
- read_arg_func read_arg;
- do_op_func do_op;
- store_op_func store;
- /* Argument count */
- uint32_t args;
-};
-
-/** A single UPDATE operation. */
-struct update_op {
- const struct update_op_meta *meta;
- union update_op_arg arg;
- /* Subject field no. */
- int32_t field_no;
- uint32_t new_field_len;
- uint8_t opcode;
-};
-
-/**
- * We can have more than one operation on the same field.
- * A descriptor of one changed field.
- */
-struct update_field {
- /** UPDATE operation against the first field in the range. */
- struct update_op *op;
- /** Points at start of field *data* in the old tuple. */
- const char *old;
- /** End of the old field. */
- const char *tail;
/**
- * Length of the "tail" in the old tuple from end
- * of old data to the beginning of the field in the
- * next update_field structure.
+ * Index base for MessagePack update operations. If update
+ * is from Lua, then the base is 1. Otherwise 0.
*/
- uint32_t tail_len;
+ int index_base;
+ /** A bitmask of all columns modified by this update. */
+ uint64_t column_mask;
+ /** First level of update tree. It is always array. */
+ struct update_field root_array;
};
-static void
-update_field_init(struct update_field *field,
- const char *old, uint32_t old_len, uint32_t tail_len)
-{
- field->op = NULL;
- field->old = old;
- field->tail = old + old_len;
- field->tail_len = tail_len;
-}
-
-/* {{{ read_arg helpers */
-
-/** Read a field index or any other integer field. */
-static inline int
-mp_read_i32(int index_base, struct update_op *op,
- const char **expr, int32_t *ret)
-{
- if (mp_read_int32(expr, ret) == 0)
- return 0;
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)op->opcode,
- index_base + op->field_no, "an integer");
- return -1;
-}
-
-static inline int
-mp_read_uint(int index_base, struct update_op *op,
- const char **expr, uint64_t *ret)
-{
- if (mp_typeof(**expr) == MP_UINT) {
- *ret = mp_decode_uint(expr);
- return 0;
- } else {
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)op->opcode,
- index_base + op->field_no, "a positive integer");
- return -1;
- }
-}
-
-/**
- * Load an argument of an arithmetic operation either from tuple
- * or from the UPDATE command.
- */
-static inline int
-mp_read_arith_arg(int index_base, struct update_op *op,
- const char **expr, struct op_arith_arg *ret)
-{
- if (mp_typeof(**expr) == MP_UINT) {
- ret->type = AT_INT;
- int96_set_unsigned(&ret->int96, mp_decode_uint(expr));
- } else if (mp_typeof(**expr) == MP_INT) {
- ret->type = AT_INT;
- int96_set_signed(&ret->int96, mp_decode_int(expr));
- } else if (mp_typeof(**expr) == MP_DOUBLE) {
- ret->type = AT_DOUBLE;
- ret->dbl = mp_decode_double(expr);
- } else if (mp_typeof(**expr) == MP_FLOAT) {
- ret->type = AT_FLOAT;
- ret->flt = mp_decode_float(expr);
- } else if (mp_typeof(**expr) == MP_EXT) {
- int8_t ext_type;
- uint32_t len = mp_decode_extl(expr, &ext_type);
- switch (ext_type) {
- case MP_DECIMAL:
- ret->type = AT_DECIMAL;
- decimal_unpack(expr, len, &ret->dec);
- break;
- default:
- goto err;
- }
- } else {
-err: diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)op->opcode,
- index_base + op->field_no, "a number");
- return -1;
- }
- return 0;
-}
-
-static inline int
-mp_read_str(int index_base, struct update_op *op,
- const char **expr, uint32_t *len, const char **ret)
-{
- if (mp_typeof(**expr) != MP_STR) {
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char) op->opcode,
- index_base + op->field_no, "a string");
- return -1;
- }
- *ret = mp_decode_str(expr, len); /* value */
- return 0;
-}
-
-/* }}} read_arg helpers */
-
-/* {{{ read_arg */
-
-static int
-read_arg_set(int index_base, struct update_op *op,
- const char **expr)
-{
- (void)index_base;
- op->arg.set.value = *expr;
- mp_next(expr);
- op->arg.set.length = (uint32_t) (*expr - op->arg.set.value);
- return 0;
-}
-
-static int
-read_arg_insert(int index_base, struct update_op *op,
- const char **expr)
-{
- return read_arg_set(index_base, op, expr);
-}
-
-static int
-read_arg_delete(int index_base, struct update_op *op,
- const char **expr)
-{
- if (mp_typeof(**expr) == MP_UINT) {
- op->arg.del.count = (uint32_t) mp_decode_uint(expr);
- return 0;
- } else {
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)op->opcode,
- index_base + op->field_no,
- "a number of fields to delete");
- return -1;
- }
-}
-
-static int
-read_arg_arith(int index_base, struct update_op *op,
- const char **expr)
-{
- return mp_read_arith_arg(index_base, op, expr, &op->arg.arith);
-}
-
-static int
-read_arg_bit(int index_base, struct update_op *op,
- const char **expr)
-{
- struct op_bit_arg *arg = &op->arg.bit;
- return mp_read_uint(index_base, op, expr, &arg->val);
-}
-
-static int
-read_arg_splice(int index_base, struct update_op *op,
- const char **expr)
-{
- struct op_splice_arg *arg = &op->arg.splice;
- if (mp_read_i32(index_base, op, expr, &arg->offset))
- return -1;
- /* cut length */
- if (mp_read_i32(index_base, op, expr, &arg->cut_length))
- return -1;
- /* value */
- return mp_read_str(index_base, op, expr, &arg->paste_length,
- &arg->paste);
-}
-
-/* }}} read_arg */
-
-/* {{{ do_op helpers */
-
-static inline int
-op_adjust_field_no(struct tuple_update *update, struct update_op *op,
- int32_t field_max)
-{
- if (op->field_no >= 0) {
- if (op->field_no < field_max)
- return 0;
- diag_set(ClientError, ER_NO_SUCH_FIELD_NO, update->index_base +
- op->field_no);
- return -1;
- } else {
- if (op->field_no + field_max >= 0) {
- op->field_no += field_max;
- return 0;
- }
- diag_set(ClientError, ER_NO_SUCH_FIELD_NO, op->field_no);
- return -1;
- }
-}
-
-static inline double
-cast_arith_arg_to_double(struct op_arith_arg arg)
-{
- if (arg.type == AT_DOUBLE) {
- return arg.dbl;
- } else if (arg.type == AT_FLOAT) {
- return arg.flt;
- } else {
- assert(arg.type == AT_INT);
- if (int96_is_uint64(&arg.int96)) {
- return int96_extract_uint64(&arg.int96);
- } else {
- assert(int96_is_neg_int64(&arg.int96));
- return int96_extract_neg_int64(&arg.int96);
- }
- }
-}
-
-static inline decimal_t *
-cast_arith_arg_to_decimal(struct op_arith_arg arg, decimal_t *dec)
-{
- decimal_t *ret;
- if (arg.type == AT_DECIMAL) {
- *dec = arg.dec;
- return dec;
- } else if (arg.type == AT_DOUBLE) {
- ret = decimal_from_double(dec, arg.dbl);
- } else if (arg.type == AT_FLOAT) {
- ret = decimal_from_double(dec, arg.flt);
- } else {
- assert(arg.type == AT_INT);
- if (int96_is_uint64(&arg.int96)) {
- uint64_t val = int96_extract_uint64(&arg.int96);
- ret = decimal_from_uint64(dec, val);
- } else {
- assert(int96_is_neg_int64(&arg.int96));
- int64_t val = int96_extract_neg_int64(&arg.int96);
- ret = decimal_from_int64(dec, val);
- }
- }
-
- return ret;
-}
-
-/** Return the MsgPack size of an arithmetic operation result. */
-static inline uint32_t
-mp_sizeof_op_arith_arg(struct op_arith_arg arg)
-{
- if (arg.type == AT_INT) {
- if (int96_is_uint64(&arg.int96)) {
- uint64_t val = int96_extract_uint64(&arg.int96);
- return mp_sizeof_uint(val);
- } else {
- int64_t val = int96_extract_neg_int64(&arg.int96);
- return mp_sizeof_int(val);
- }
- } else if (arg.type == AT_DOUBLE) {
- return mp_sizeof_double(arg.dbl);
- } else if (arg.type == AT_FLOAT) {
- return mp_sizeof_float(arg.flt);
- } else {
- assert(arg.type == AT_DECIMAL);
- return mp_sizeof_decimal(&arg.dec);
- }
-}
-
-static inline int
-make_arith_operation(struct op_arith_arg arg1, struct op_arith_arg arg2,
- char opcode, uint32_t err_fieldno, struct op_arith_arg *ret)
-{
- enum arith_type lowest_type = arg1.type;
- if (arg1.type > arg2.type)
- lowest_type = arg2.type;
-
- if (lowest_type == AT_INT) {
- switch(opcode) {
- case '+':
- int96_add(&arg1.int96, &arg2.int96);
- break;
- case '-':
- int96_invert(&arg2.int96);
- int96_add(&arg1.int96, &arg2.int96);
- break;
- default:
- unreachable(); /* checked by update_read_ops */
- break;
- }
- if (!int96_is_uint64(&arg1.int96) &&
- !int96_is_neg_int64(&arg1.int96)) {
- diag_set(ClientError,
- ER_UPDATE_INTEGER_OVERFLOW,
- opcode, err_fieldno);
- return -1;
- }
- *ret = arg1;
- return 0;
- } else if (lowest_type >= AT_DOUBLE) {
- /* At least one of operands is double or float */
- double a = cast_arith_arg_to_double(arg1);
- double b = cast_arith_arg_to_double(arg2);
- double c;
- switch(opcode) {
- case '+': c = a + b; break;
- case '-': c = a - b; break;
- default:
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)opcode,
- err_fieldno, "a positive integer");
- return -1;
- }
- if (lowest_type == AT_DOUBLE) {
- /* result is DOUBLE */
- ret->type = AT_DOUBLE;
- ret->dbl = c;
- } else {
- /* result is FLOAT */
- assert(lowest_type == AT_FLOAT);
- ret->type = AT_FLOAT;
- ret->flt = (float)c;
- }
- } else {
- /* At least one of the operands is decimal. */
- decimal_t a, b, c;
- if (! cast_arith_arg_to_decimal(arg1, &a) ||
- ! cast_arith_arg_to_decimal(arg2, &b)) {
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)opcode,
- err_fieldno, "a number convertible to decimal.");
- return -1;
- }
-
- switch(opcode) {
- case '+':
- if (decimal_add(&c, &a, &b) == NULL) {
- diag_set(ClientError, ER_UPDATE_DECIMAL_OVERFLOW,
- opcode, err_fieldno);
- return -1;
- }
- break;
- case '-':
- if (decimal_sub(&c, &a, &b) == NULL) {
- diag_set(ClientError, ER_UPDATE_DECIMAL_OVERFLOW,
- opcode, err_fieldno);
- return -1;
- }
- break;
- default:
- diag_set(ClientError, ER_UPDATE_ARG_TYPE, (char)opcode,
- err_fieldno);
- return -1;
- }
- ret->type = AT_DECIMAL;
- ret->dec = c;
- }
- return 0;
-}
-
-/* }}} do_op helpers */
-
-/* {{{ do_op */
-
-static int
-do_op_insert(struct tuple_update *update, struct update_op *op)
-{
- if (op_adjust_field_no(update, op, rope_size(update->rope) + 1))
- return -1;
- struct update_field *field = (struct update_field *)
- update_alloc(update->region, sizeof(*field));
- if (field == NULL)
- return -1;
- update_field_init(field, op->arg.set.value, op->arg.set.length, 0);
- return rope_insert(update->rope, op->field_no, field, 1);
-}
-
-static int
-do_op_set(struct tuple_update *update, struct update_op *op)
-{
- /* intepret '=' for n +1 field as insert */
- if (op->field_no == (int32_t) rope_size(update->rope))
- return do_op_insert(update, op);
- if (op_adjust_field_no(update, op, rope_size(update->rope)))
- return -1;
- struct update_field *field = rope_extract(update->rope, op->field_no);
- if (field == NULL)
- return -1;
- /* Ignore the previous op, if any. */
- field->op = op;
- op->new_field_len = op->arg.set.length;
- return 0;
-}
-
-static int
-do_op_delete(struct tuple_update *update, struct update_op *op)
-{
- if (op_adjust_field_no(update, op, rope_size(update->rope)))
- return -1;
- uint32_t delete_count = op->arg.del.count;
-
- if ((uint64_t) op->field_no + delete_count > rope_size(update->rope))
- delete_count = rope_size(update->rope) - op->field_no;
-
- if (delete_count == 0) {
- diag_set(ClientError, ER_UPDATE_FIELD,
- update->index_base + op->field_no,
- "cannot delete 0 fields");
- return -1;
- }
-
- for (uint32_t u = 0; u < delete_count; u++)
- rope_erase(update->rope, op->field_no);
- return 0;
-}
-
-static int
-do_op_arith(struct tuple_update *update, struct update_op *op)
-{
- if (op_adjust_field_no(update, op, rope_size(update->rope)))
- return -1;
-
- struct update_field *field = rope_extract(update->rope, op->field_no);
- if (field == NULL)
- return -1;
- if (field->op) {
- diag_set(ClientError, ER_UPDATE_FIELD,
- update->index_base + op->field_no,
- "double update of the same field");
- return -1;
- }
- const char *old = field->old;
- struct op_arith_arg left_arg;
- if (mp_read_arith_arg(update->index_base, op, &old, &left_arg))
- return -1;
-
- struct op_arith_arg right_arg = op->arg.arith;
- if (make_arith_operation(left_arg, right_arg, op->opcode,
- update->index_base + op->field_no,
- &op->arg.arith))
- return -1;
- field->op = op;
- op->new_field_len = mp_sizeof_op_arith_arg(op->arg.arith);
- return 0;
-}
-
-static int
-do_op_bit(struct tuple_update *update, struct update_op *op)
-{
- if (op_adjust_field_no(update, op, rope_size(update->rope)))
- return -1;
- struct update_field *field = rope_extract(update->rope, op->field_no);
- if (field == NULL)
- return -1;
- struct op_bit_arg *arg = &op->arg.bit;
- if (field->op) {
- diag_set(ClientError, ER_UPDATE_FIELD,
- update->index_base + op->field_no,
- "double update of the same field");
- return -1;
- }
- const char *old = field->old;
- uint64_t val;
- if (mp_read_uint(update->index_base, op, &old, &val))
- return -1;
- switch (op->opcode) {
- case '&':
- arg->val &= val;
- break;
- case '^':
- arg->val ^= val;
- break;
- case '|':
- arg->val |= val;
- break;
- default:
- unreachable(); /* checked by update_read_ops */
- }
- field->op = op;
- op->new_field_len = mp_sizeof_uint(arg->val);
- return 0;
-}
-
-static int
-do_op_splice(struct tuple_update *update, struct update_op *op)
-{
- if (op_adjust_field_no(update, op, rope_size(update->rope)))
- return -1;
- struct update_field *field = rope_extract(update->rope, op->field_no);
- if (field == NULL)
- return -1;
- if (field->op) {
- diag_set(ClientError, ER_UPDATE_FIELD,
- update->index_base + op->field_no,
- "double update of the same field");
- return -1;
- }
-
- struct op_splice_arg *arg = &op->arg.splice;
-
- const char *in = field->old;
- int32_t str_len;
- if (mp_read_str(update->index_base, op, &in, (uint32_t *) &str_len, &in))
- return -1;
-
- if (arg->offset < 0) {
- if (-arg->offset > str_len + 1) {
- diag_set(ClientError, ER_SPLICE,
- update->index_base + op->field_no,
- "offset is out of bound");
- return -1;
- }
- arg->offset = arg->offset + str_len + 1;
- } else if (arg->offset - update->index_base >= 0) {
- arg->offset -= update->index_base;
- if (arg->offset > str_len)
- arg->offset = str_len;
- } else /* (offset <= 0) */ {
- diag_set(ClientError, ER_SPLICE,
- update->index_base + op->field_no,
- "offset is out of bound");
- return -1;
- }
-
- assert(arg->offset >= 0 && arg->offset <= str_len);
-
- if (arg->cut_length < 0) {
- if (-arg->cut_length > (str_len - arg->offset))
- arg->cut_length = 0;
- else
- arg->cut_length += str_len - arg->offset;
- } else if (arg->cut_length > str_len - arg->offset) {
- arg->cut_length = str_len - arg->offset;
- }
-
- assert(arg->offset <= str_len);
-
- /* Fill tail part */
- arg->tail_offset = arg->offset + arg->cut_length;
- arg->tail_length = str_len - arg->tail_offset;
-
-
- field->op = op;
- /* Record the new field length (maximal). */
- op->new_field_len = mp_sizeof_str(arg->offset + arg->paste_length +
- arg->tail_length);
- return 0;
-}
-
-/* }}} do_op */
-
-/* {{{ store_op */
-
-static void
-store_op_set(struct op_set_arg *arg, const char *in, char *out)
-{
- (void)in;
- memcpy(out, arg->value, arg->length);
-}
-
-static void
-store_op_insert(struct op_set_arg *arg, const char *in, char *out)
-{
- (void)in;
- memcpy(out, arg->value, arg->length);
-}
-
-static void
-store_op_arith(struct op_arith_arg *arg, const char *in, char *out)
-{
- (void)in;
- if (arg->type == AT_INT) {
- if (int96_is_uint64(&arg->int96)) {
- mp_encode_uint(out, int96_extract_uint64(&arg->int96));
- } else {
- assert(int96_is_neg_int64(&arg->int96));
- mp_encode_int(out, int96_extract_neg_int64(&arg->int96));
- }
- } else if (arg->type == AT_DOUBLE) {
- mp_encode_double(out, arg->dbl);
- } else if (arg->type == AT_FLOAT) {
- mp_encode_float(out, arg->flt);
- } else {
- assert (arg->type == AT_DECIMAL);
- mp_encode_decimal(out, &arg->dec);
- }
-}
-
-static void
-store_op_bit(struct op_bit_arg *arg, const char *in, char *out)
-{
- (void)in;
- mp_encode_uint(out, arg->val);
-}
-
-static void
-store_op_splice(struct op_splice_arg *arg, const char *in, char *out)
-{
- uint32_t new_str_len = arg->offset + arg->paste_length
- + arg->tail_length;
-
- (void) mp_decode_strl(&in);
-
- out = mp_encode_strl(out, new_str_len);
- memcpy(out, in, arg->offset); /* copy field head. */
- out = out + arg->offset;
- memcpy(out, arg->paste, arg->paste_length); /* copy the paste */
- out = out + arg->paste_length;
- memcpy(out, in + arg->tail_offset, arg->tail_length); /* copy tail */
-}
-
-/* }}} store_op */
-
-static const struct update_op_meta op_set =
- { read_arg_set, do_op_set, (store_op_func) store_op_set, 3 };
-static const struct update_op_meta op_insert =
- { read_arg_insert, do_op_insert, (store_op_func) store_op_insert, 3 };
-static const struct update_op_meta op_arith =
- { read_arg_arith, do_op_arith, (store_op_func) store_op_arith, 3 };
-static const struct update_op_meta op_bit =
- { read_arg_bit, do_op_bit, (store_op_func) store_op_bit, 3 };
-static const struct update_op_meta op_splice =
- { read_arg_splice, do_op_splice, (store_op_func) store_op_splice, 5 };
-static const struct update_op_meta op_delete =
- { read_arg_delete, do_op_delete, (store_op_func) NULL, 3 };
-
-/** Split a range of fields in two, allocating update_field
- * context for the new range.
- */
-static struct update_field *
-update_field_split(struct region *region, struct update_field *prev,
- size_t size, size_t offset)
-{
- (void) size;
- struct update_field *next =
- (struct update_field *) update_alloc(region, sizeof(*next));
- if (next == NULL)
- return NULL;
- assert(offset > 0 && prev->tail_len > 0);
-
- const char *field = prev->tail;
- const char *end = field + prev->tail_len;
-
- for (uint32_t i = 1; i < offset; i++) {
- mp_next(&field);
- }
-
- prev->tail_len = field - prev->tail;
- const char *f = field;
- mp_next(&f);
- uint32_t field_len = f - field;
-
- update_field_init(next, field, field_len, end - field - field_len);
- return next;
-}
-
-/**
- * We found a tuple to do the update on. Prepare a rope
- * to perform operations on.
- * @param update Update meta.
- * @param tuple_data MessagePack array without the array header.
- * @param tuple_data_end End of the @tuple_data.
- * @param field_count Field count in @tuple_data.
- *
- * @retval 0 Success.
- * @retval -1 Error.
- */
-struct rope *
-tuple_rope_new(struct region *region, const char *tuple_data,
- const char *tuple_data_end, uint32_t field_count)
-{
- struct rope *rope = rope_new(region);
- if (rope == NULL)
- return NULL;
- /* Initialize the rope with the old tuple. */
- struct update_field *first =
- (struct update_field *) update_alloc(region, sizeof(*first));
- if (first == NULL)
- return NULL;
- const char *field = tuple_data;
- const char *end = tuple_data_end;
- if (field == end)
- return rope;
-
- /* Add first field to rope */
- mp_next(&tuple_data);
- uint32_t field_len = tuple_data - field;
- update_field_init(first, field, field_len, end - field - field_len);
-
- return rope_append(rope, first, field_count) != 0 ? NULL : rope;
-}
-
-static uint32_t
-update_calc_tuple_length(struct tuple_update *update)
-{
- uint32_t res = mp_sizeof_array(rope_size(update->rope));
- struct rope_iter it;
- struct rope_node *node;
-
- rope_iter_create(&it, update->rope);
- for (node = rope_iter_start(&it); node; node = rope_iter_next(&it)) {
- struct update_field *field = rope_leaf_data(node);
- uint32_t field_len = (field->op ? field->op->new_field_len :
- (uint32_t)(field->tail - field->old));
- res += field_len + field->tail_len;
- }
-
- return res;
-}
-
-static uint32_t
-update_write_tuple(struct tuple_update *update, char *buffer, char *buffer_end)
-{
- char *new_data = buffer;
- new_data = mp_encode_array(new_data, rope_size(update->rope));
-
- (void) buffer_end;
-
- uint32_t total_field_count = 0;
-
- struct rope_iter it;
- struct rope_node *node;
-
- rope_iter_create(&it, update->rope);
- for (node = rope_iter_start(&it); node; node = rope_iter_next(&it)) {
- struct update_field *field = rope_leaf_data(node);
- uint32_t field_count = rope_leaf_size(node);
- const char *old_field = field->old;
- struct update_op *op = field->op;
- if (op) {
- op->meta->store(&op->arg, old_field, new_data);
- new_data += op->new_field_len;
- } else {
- uint32_t field_len = field->tail - field->old;
- memcpy(new_data, old_field, field_len);
- new_data += field_len;
- }
- /* Copy tail_len from the old tuple. */
- assert(field->tail_len == 0 || field_count > 1);
- if (field_count > 1) {
- memcpy(new_data, field->tail, field->tail_len);
- new_data += field->tail_len;
- }
- total_field_count += field_count;
- }
-
- assert(rope_size(update->rope) == total_field_count);
- assert(new_data <= buffer_end);
- return new_data - buffer; /* real_tuple_size */
-}
-
-static const struct update_op_meta *
-update_op_by(char opcode)
-{
- switch (opcode) {
- case '=':
- return &op_set;
- case '+':
- case '-':
- return &op_arith;
- case '&':
- case '|':
- case '^':
- return &op_bit;
- case ':':
- return &op_splice;
- case '#':
- return &op_delete;
- case '!':
- return &op_insert;
- default:
- diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
- return NULL;
- }
-}
-
-/**
- * Decode an update operation from MessagePack.
- * @param[out] op Update operation.
- * @param index_base Field numbers base: 0 or 1.
- * @param dict Dictionary to lookup field number by a name.
- * @param expr MessagePack.
- *
- * @retval 0 Success.
- * @retval -1 Client error.
- */
-static inline int
-update_op_decode(struct update_op *op, int index_base,
- struct tuple_dictionary *dict, const char **expr)
-{
- if (mp_typeof(**expr) != MP_ARRAY) {
- diag_set(ClientError, ER_ILLEGAL_PARAMS, "update operation "
- "must be an array {op,..}");
- return -1;
- }
- uint32_t len, args = mp_decode_array(expr);
- if (args < 1) {
- diag_set(ClientError, ER_ILLEGAL_PARAMS, "update operation "\
- "must be an array {op,..}, got empty array");
- return -1;
- }
- if (mp_typeof(**expr) != MP_STR) {
- diag_set(ClientError, ER_ILLEGAL_PARAMS,
- "update operation name must be a string");
- return -1;
- }
- op->opcode = *mp_decode_str(expr, &len);
- op->meta = update_op_by(op->opcode);
- if (op->meta == NULL)
- return -1;
- if (args != op->meta->args) {
- diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
- return -1;
- }
- int32_t field_no;
- switch(mp_typeof(**expr)) {
- case MP_INT:
- case MP_UINT: {
- if (mp_read_i32(index_base, op, expr, &field_no) != 0)
- return -1;
- if (field_no - index_base >= 0) {
- op->field_no = field_no - index_base;
- } else if (field_no < 0) {
- op->field_no = field_no;
- } else {
- diag_set(ClientError, ER_NO_SUCH_FIELD_NO, field_no);
- return -1;
- }
- break;
- }
- case MP_STR: {
- const char *path = mp_decode_str(expr, &len);
- uint32_t field_no, hash = field_name_hash(path, len);
- if (tuple_fieldno_by_name(dict, path, len, hash,
- &field_no) != 0) {
- diag_set(ClientError, ER_NO_SUCH_FIELD_NAME,
- tt_cstr(path, len));
- return -1;
- }
- op->field_no = (int32_t) field_no;
- break;
- }
- default:
- diag_set(ClientError, ER_ILLEGAL_PARAMS,
- "field id must be a number or a string");
- return -1;
- }
- return op->meta->read_arg(index_base, op, expr);
-}
-
/**
* Read and check update operations and fill column mask.
*
@@ -1112,20 +83,18 @@ update_read_ops(struct tuple_update *update, const char *expr,
return -1;
}
uint64_t column_mask = 0;
- /* number of operations */
update->op_count = mp_decode_array(&expr);
-
if (update->op_count > BOX_UPDATE_OP_CNT_MAX) {
diag_set(ClientError, ER_ILLEGAL_PARAMS,
"too many operations for update");
return -1;
}
-
- /* Read update operations. */
- update->ops = (struct update_op *) update_alloc(update->region,
- update->op_count * sizeof(struct update_op));
- if (update->ops == NULL)
+ int size = update->op_count * sizeof(update->ops[0]);
+ update->ops = (struct update_op *) region_alloc(&fiber()->gc, size);
+ if (update->ops == NULL) {
+ diag_set(OutOfMemory, size, "region_alloc", "update->ops");
return -1;
+ }
struct update_op *op = update->ops;
struct update_op *ops_end = op + update->op_count;
for (; op < ops_end; op++) {
@@ -1227,14 +196,13 @@ static int
update_do_ops(struct tuple_update *update, const char *old_data,
const char *old_data_end, uint32_t part_count)
{
- update->rope = tuple_rope_new(update->region, old_data, old_data_end,
- part_count);
- if (update->rope == NULL)
+ if (update_array_create(&update->root_array, old_data, old_data_end,
+ part_count) != 0)
return -1;
struct update_op *op = update->ops;
struct update_op *ops_end = op + update->op_count;
for (; op < ops_end; op++) {
- if (op->meta->do_op(update, op))
+ if (op->meta->do_cb(op, &update->root_array) != 0)
return -1;
}
return 0;
@@ -1250,14 +218,13 @@ upsert_do_ops(struct tuple_update *update, const char *old_data,
const char *old_data_end, uint32_t part_count,
bool suppress_error)
{
- update->rope = tuple_rope_new(update->region, old_data, old_data_end,
- part_count);
- if (update->rope == NULL)
+ if (update_array_create(&update->root_array, old_data, old_data_end,
+ part_count) != 0)
return -1;
struct update_op *op = update->ops;
struct update_op *ops_end = op + update->op_count;
for (; op < ops_end; op++) {
- if (op->meta->do_op(update, op) == 0)
+ if (op->meta->do_cb(op, &update->root_array) == 0)
continue;
struct error *e = diag_last_error(diag_get());
if (e->type != &type_ClientError)
@@ -1274,23 +241,22 @@ static void
update_init(struct tuple_update *update, int index_base)
{
memset(update, 0, sizeof(*update));
- update->region = &fiber()->gc;
- /*
- * Base field offset, e.g. 0 for C and 1 for Lua. Used only for
- * error messages. All fields numbers must be zero-based!
- */
update->index_base = index_base;
}
-const char *
+static const char *
update_finish(struct tuple_update *update, uint32_t *p_tuple_len)
{
- uint32_t tuple_len = update_calc_tuple_length(update);
- char *buffer = (char *) update_alloc(update->region, tuple_len);
- if (buffer == NULL)
+ uint32_t tuple_size = update_array_sizeof(&update->root_array);
+ char *out = (char *) region_alloc(&fiber()->gc, tuple_size);
+ if (out == NULL) {
+ diag_set(OutOfMemory, tuple_size, "region_alloc", "buffer");
return NULL;
- *p_tuple_len = update_write_tuple(update, buffer, buffer + tuple_len);
- return buffer;
+ }
+ *p_tuple_len = update_array_store(&update->root_array, out,
+ out + tuple_size);
+ assert(*p_tuple_len == tuple_size);
+ return out;
}
int
@@ -1371,10 +337,13 @@ tuple_upsert_squash(const char *expr1, const char *expr1_end,
}
size_t possible_size = expr1_end - expr1 + expr2_end - expr2;
const uint32_t space_for_arr_tag = 5;
- char *buf = (char *) update_alloc(&fiber()->gc,
+ char *buf = (char *) region_alloc(&fiber()->gc,
possible_size + space_for_arr_tag);
- if (buf == NULL)
+ if (buf == NULL) {
+ diag_set(OutOfMemory, possible_size + space_for_arr_tag,
+ "region_alloc", "buf");
return NULL;
+ }
/* reserve some space for mp array header */
char *res_ops = buf + space_for_arr_tag;
uint32_t res_count = 0; /* number of resulting operations */
@@ -1428,20 +397,16 @@ tuple_upsert_squash(const char *expr1, const char *expr1_end,
op[0]->opcode = '+';
int96_invert(&op[0]->arg.arith.int96);
}
- struct op_arith_arg res;
- if (make_arith_operation(op[0]->arg.arith, op[1]->arg.arith,
- op[1]->opcode,
- update[0].index_base +
- op[0]->field_no, &res))
+ struct update_op res;
+ if (make_arith_operation(op[1], op[0]->arg.arith,
+ &res.arg.arith) != 0)
return NULL;
res_ops = mp_encode_array(res_ops, 3);
- res_ops = mp_encode_str(res_ops,
- (const char *)&op[0]->opcode, 1);
- res_ops = mp_encode_uint(res_ops,
- op[0]->field_no +
- update[0].index_base);
+ res_ops = mp_encode_str(res_ops, &op[0]->opcode, 1);
+ res_ops = mp_encode_uint(res_ops, op[0]->field_no +
+ update[0].index_base);
store_op_arith(&res, NULL, res_ops);
- res_ops += mp_sizeof_op_arith_arg(res);
+ res_ops += update_arith_sizeof(&res.arg.arith);
mp_next(&expr[0]);
mp_next(&expr[1]);
op_no[0]++;
diff --git a/src/box/update/update_array.c b/src/box/update/update_array.c
new file mode 100644
index 000000000..985d555af
--- /dev/null
+++ b/src/box/update/update_array.c
@@ -0,0 +1,288 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "update_field.h"
+#include "msgpuck.h"
+#include "fiber.h"
+
+/**
+ * Make the field number positive and check that the field exists.
+ * @param op Update operation.
+ * @param field_count Field count in the array containing this field.
+ *
+ * @retval 0 Success.
+ * @retval -1 Error.
+ */
+static inline int
+update_op_adjust_field_no(struct update_op *op, int32_t field_count)
+{
+ if (op->field_no >= 0) {
+ if (op->field_no < field_count)
+ return 0;
+ return update_err_no_such_field(op);
+ }
+ if (op->field_no + field_count >= 0) {
+ op->field_no += field_count;
+ return 0;
+ }
+ return update_err_no_such_field(op);
+}
+
+/**
+ * An update of an array consists of array items. Each item is a
+ * range of fields: the item updates the first field of the range
+ * and stores the size of the others so as to copy them unchanged
+ * into a new tuple later. Thanks to that, an update of a single
+ * field in an array stores at most 2 objects - an item for the
+ * preceding fields, and an item for this field + its tail. This
+ * is how the rope data structure works.
+ */
+struct update_array_item {
+ /** Update of the first field in the range. */
+ struct update_field field;
+ /** Size of other fields in the range. */
+ uint32_t tail_size;
+};
+
+/** Initialize an array item. */
+static inline void
+update_array_item_create(struct update_array_item *item, enum update_type type,
+ const char *data, uint32_t size, uint32_t tail_size)
+{
+ item->field.type = type;
+ item->field.data = data;
+ item->field.size = size;
+ item->tail_size = tail_size;
+}
+
+/** Rope allocator for nodes, paths, items etc. */
+static inline void *
+rope_alloc(struct region *region, size_t size)
+{
+ void *ptr = region_aligned_alloc(region, size, alignof(uint64_t));
+ if (ptr == NULL) {
+ diag_set(OutOfMemory, size, "region_aligned_alloc",
+ "update internals");
+ }
+ return ptr;
+}
+
+/** Split a range of fields in two. */
+static struct update_array_item *
+rope_field_split(struct region *region, struct update_array_item *prev,
+ size_t size, size_t offset)
+{
+ (void) size;
+ struct update_array_item *next =
+ (struct update_array_item *) rope_alloc(region, sizeof(*next));
+ if (next == NULL)
+ return NULL;
+ assert(offset > 0 && prev->tail_size > 0);
+
+ const char *tail = prev->field.data + prev->field.size;
+ const char *field = tail;
+ const char *range_end = tail + prev->tail_size;
+
+ for (uint32_t i = 1; i < offset; ++i)
+ mp_next(&field);
+
+ prev->tail_size = field - tail;
+ const char *field_end = field;
+ mp_next(&field_end);
+ update_array_item_create(next, UPDATE_NOP, field, field_end - field,
+ range_end - field_end);
+ return next;
+}
+
+#define ROPE_SPLIT_F rope_field_split
+#define ROPE_ALLOC_F rope_alloc
+#define rope_data_t struct update_array_item *
+#define rope_ctx_t struct region *
+
+#include "salad/rope.h"
+
+/**
+ * Extract from the array an item whose range starts from the
+ * field affected by @a op.
+ */
+static inline struct update_array_item *
+update_array_extract_item(struct update_field *field, struct update_op *op)
+{
+ assert(field->type == UPDATE_ARRAY);
+ struct rope *rope = field->array.rope;
+ if (update_op_adjust_field_no(op, rope_size(rope)) != 0)
+ return NULL;
+ return rope_extract(rope, op->field_no);
+}
+
+int
+update_array_create(struct update_field *field, const char *data,
+ const char *data_end, uint32_t field_count)
+{
+ field->type = UPDATE_ARRAY;
+ field->data = data;
+ field->size = data_end - data;
+ field->array.rope = rope_new(&fiber()->gc);
+ if (field->array.rope == NULL)
+ return -1;
+ struct update_array_item *item = (struct update_array_item *)
+ rope_alloc(&fiber()->gc, sizeof(*item));
+ if (item == NULL)
+ return -1;
+ if (data == data_end)
+ return 0;
+ /*
+ * Initial item consists of one range - the whole array.
+ */
+ const char *begin = data;
+ mp_next(&data);
+ update_array_item_create(item, UPDATE_NOP, begin, data - begin,
+ data_end - data);
+ return rope_append(field->array.rope, item, field_count);
+}
+
+uint32_t
+update_array_sizeof(struct update_field *field)
+{
+ assert(field->type == UPDATE_ARRAY);
+ struct rope_iter it;
+ rope_iter_create(&it, field->array.rope);
+
+ uint32_t res = mp_sizeof_array(rope_size(field->array.rope));
+ for (struct rope_node *node = rope_iter_start(&it); node != NULL;
+ node = rope_iter_next(&it)) {
+ struct update_array_item *item = rope_leaf_data(node);
+ res += update_field_sizeof(&item->field) + item->tail_size;
+ }
+ return res;
+}
+
+uint32_t
+update_array_store(struct update_field *field, char *out, char *out_end)
+{
+ assert(field->type == UPDATE_ARRAY);
+ char *out_begin = out;
+ out = mp_encode_array(out, rope_size(field->array.rope));
+ uint32_t total_field_count = 0;
+ struct rope_iter it;
+ rope_iter_create(&it, field->array.rope);
+ for (struct rope_node *node = rope_iter_start(&it); node != NULL;
+ node = rope_iter_next(&it)) {
+ struct update_array_item *item = rope_leaf_data(node);
+ uint32_t field_count = rope_leaf_size(node);
+ out += update_field_store(&item->field, out, out_end);
+ assert(item->tail_size == 0 || field_count > 1);
+ memcpy(out, item->field.data + item->field.size,
+ item->tail_size);
+ out += item->tail_size;
+ total_field_count += field_count;
+ }
+ (void) total_field_count;
+ assert(rope_size(field->array.rope) == total_field_count);
+ assert(out <= out_end);
+ return out - out_begin;
+}
+
+int
+do_op_array_insert(struct update_op *op, struct update_field *field)
+{
+ assert(field->type == UPDATE_ARRAY);
+ struct rope *rope = field->array.rope;
+ if (update_op_adjust_field_no(op, rope_size(rope) + 1) != 0)
+ return -1;
+
+ struct update_array_item *item =
+ (struct update_array_item *) rope_alloc(rope->ctx,
+ sizeof(*item));
+ if (item == NULL)
+ return -1;
+ update_array_item_create(item, UPDATE_NOP, op->arg.set.value,
+ op->arg.set.length, 0);
+ return rope_insert(rope, op->field_no, item, 1);
+}
+
+int
+do_op_array_set(struct update_op *op, struct update_field *field)
+{
+ assert(field->type == UPDATE_ARRAY);
+ struct rope *rope = field->array.rope;
+ /* Interpret '=' for n + 1 field as insert. */
+ if (op->field_no == (int32_t) rope_size(rope))
+ return do_op_array_insert(op, field);
+
+ struct update_array_item *item =
+ update_array_extract_item(field, op);
+ if (item == NULL)
+ return -1;
+ op->new_field_len = op->arg.set.length;
+ /* Ignore the previous op, if any. */
+ item->field.type = UPDATE_SCALAR;
+ item->field.scalar.op = op;
+ return 0;
+}
+
+int
+do_op_array_delete(struct update_op *op, struct update_field *field)
+{
+ assert(field->type == UPDATE_ARRAY);
+ struct rope *rope = field->array.rope;
+ uint32_t size = rope_size(rope);
+ if (update_op_adjust_field_no(op, size) != 0)
+ return -1;
+ uint32_t delete_count = op->arg.del.count;
+ if ((uint64_t) op->field_no + delete_count > size)
+ delete_count = size - op->field_no;
+ for (uint32_t u = delete_count; u != 0; --u)
+ rope_erase(rope, op->field_no);
+ return 0;
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type) \
+int \
+do_op_array_##op_type(struct update_op *op, struct update_field *field) \
+{ \
+ struct update_array_item *item = \
+ update_array_extract_item(field, op); \
+ if (item == NULL) \
+ return -1; \
+ if (item->field.type != UPDATE_NOP) \
+ return update_err_double(op); \
+ if (update_op_do_##op_type(op, item->field.data) != 0) \
+ return -1; \
+ item->field.type = UPDATE_SCALAR; \
+ item->field.scalar.op = op; \
+ return 0; \
+}
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c
new file mode 100644
index 000000000..b4ede54db
--- /dev/null
+++ b/src/box/update/update_field.c
@@ -0,0 +1,649 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "update_field.h"
+#include "box/tuple_format.h"
+#include "mp_extension_types.h"
+
+/* {{{ Error helpers. */
+
+/**
+ * Get a string identifier of the field updated by @a op.
+ * @param op Operation whose field to identify.
+ * @return String with the field identifier.
+ */
+static inline const char *
+update_op_field_str(const struct update_op *op)
+{
+ if (op->field_no >= 0)
+ return tt_sprintf("%d", op->field_no + TUPLE_INDEX_BASE);
+ else
+ return tt_sprintf("%d", op->field_no);
+}
+
+static inline int
+update_err_arg_type(const struct update_op *op, const char *needed_type)
+{
+ diag_set(ClientError, ER_UPDATE_ARG_TYPE, op->opcode,
+ update_op_field_str(op), needed_type);
+ return -1;
+}
+
+static inline int
+update_err_int_overflow(const struct update_op *op)
+{
+ diag_set(ClientError, ER_UPDATE_INTEGER_OVERFLOW, op->opcode,
+ update_op_field_str(op));
+ return -1;
+}
+
+static inline int
+update_err_decimal_overflow(const struct update_op *op)
+{
+ diag_set(ClientError, ER_UPDATE_DECIMAL_OVERFLOW, op->opcode,
+ update_op_field_str(op));
+ return -1;
+}
+
+static inline int
+update_err_splice_bound(const struct update_op *op)
+{
+ diag_set(ClientError, ER_UPDATE_SPLICE, update_op_field_str(op),
+ "offset is out of bound");
+ return -1;
+}
+
+int
+update_err_no_such_field(const struct update_op *op)
+{
+ diag_set(ClientError, ER_NO_SUCH_FIELD_NO, op->field_no >= 0 ?
+ TUPLE_INDEX_BASE + op->field_no : op->field_no);
+ return -1;
+}
+
+int
+update_err(const struct update_op *op, const char *reason)
+{
+ diag_set(ClientError, ER_UPDATE_FIELD, update_op_field_str(op),
+ reason);
+ return -1;
+}
+
+/* }}} Error helpers. */
+
+uint32_t
+update_field_sizeof(struct update_field *field)
+{
+ switch (field->type) {
+ case UPDATE_NOP:
+ return field->size;
+ case UPDATE_SCALAR:
+ return field->scalar.op->new_field_len;
+ case UPDATE_ARRAY:
+ return update_array_sizeof(field);
+ default:
+ unreachable();
+ }
+ return 0;
+}
+
+uint32_t
+update_field_store(struct update_field *field, char *out, char *out_end)
+{
+ struct update_op *op;
+ uint32_t size;
+ switch(field->type) {
+ case UPDATE_NOP:
+ assert(out_end - out >= field->size);
+ memcpy(out, field->data, field->size);
+ return field->size;
+ case UPDATE_SCALAR:
+ op = field->scalar.op;
+ size = op->new_field_len;
+ assert(out_end - out >= size);
+ op->meta->store_cb(op, field->data, out);
+ return size;
+ case UPDATE_ARRAY:
+ return update_array_store(field, out, out_end);
+ default:
+ unreachable();
+ }
+ return 0;
+}
+
+/* {{{ read_arg helpers. */
+
+static inline int
+mp_read_i32(struct update_op *op, const char **expr, int32_t *ret)
+{
+ if (mp_read_int32(expr, ret) == 0)
+ return 0;
+ return update_err_arg_type(op, "an integer");
+}
+
+static inline int
+mp_read_uint(struct update_op *op, const char **expr, uint64_t *ret)
+{
+ if (mp_typeof(**expr) == MP_UINT) {
+ *ret = mp_decode_uint(expr);
+ return 0;
+ }
+ return update_err_arg_type(op, "a positive integer");
+}
+
+static inline int
+mp_read_arith_arg(struct update_op *op, const char **expr,
+ struct op_arith_arg *ret)
+{
+ int8_t ext_type;
+ uint32_t len;
+ switch(mp_typeof(**expr)) {
+ case MP_UINT:
+ ret->type = AT_INT;
+ int96_set_unsigned(&ret->int96, mp_decode_uint(expr));
+ return 0;
+ case MP_INT:
+ ret->type = AT_INT;
+ int96_set_signed(&ret->int96, mp_decode_int(expr));
+ return 0;
+ case MP_DOUBLE:
+ ret->type = AT_DOUBLE;
+ ret->dbl = mp_decode_double(expr);
+ return 0;
+ case MP_FLOAT:
+ ret->type = AT_FLOAT;
+ ret->flt = mp_decode_float(expr);
+ return 0;
+ case MP_EXT:
+ len = mp_decode_extl(expr, &ext_type);
+ if (ext_type == MP_DECIMAL) {
+ ret->type = AT_DECIMAL;
+ decimal_unpack(expr, len, &ret->dec);
+ return 0;
+ }
+ FALLTHROUGH;
+ default:
+ return update_err_arg_type(op, "a number");
+ }
+}
+
+static inline int
+mp_read_str(struct update_op *op, const char **expr, uint32_t *len,
+ const char **ret)
+{
+ if (mp_typeof(**expr) == MP_STR) {
+ *ret = mp_decode_str(expr, len);
+ return 0;
+ }
+ return update_err_arg_type(op, "a string");
+}
+
+/* }}} read_arg helpers. */
+
+/* {{{ read_arg */
+
+static int
+read_arg_set(struct update_op *op, const char **expr, int index_base)
+{
+ (void) index_base;
+ op->arg.set.value = *expr;
+ mp_next(expr);
+ op->arg.set.length = (uint32_t) (*expr - op->arg.set.value);
+ return 0;
+}
+
+static int
+read_arg_insert(struct update_op *op, const char **expr, int index_base)
+{
+ return read_arg_set(op, expr, index_base);
+}
+
+static int
+read_arg_delete(struct update_op *op, const char **expr, int index_base)
+{
+ (void) index_base;
+ if (mp_typeof(**expr) == MP_UINT) {
+ op->arg.del.count = (uint32_t) mp_decode_uint(expr);
+ if (op->arg.del.count != 0)
+ return 0;
+ return update_err(op, "cannot delete 0 fields");
+ }
+ return update_err_arg_type(op, "a number of fields to delete");
+}
+
+static int
+read_arg_arith(struct update_op *op, const char **expr, int index_base)
+{
+ (void) index_base;
+ return mp_read_arith_arg(op, expr, &op->arg.arith);
+}
+
+static int
+read_arg_bit(struct update_op *op, const char **expr, int index_base)
+{
+ (void) index_base;
+ return mp_read_uint(op, expr, &op->arg.bit.val);
+}
+
+static int
+read_arg_splice(struct update_op *op, const char **expr, int index_base)
+{
+ struct op_splice_arg *arg = &op->arg.splice;
+ if (mp_read_i32(op, expr, &arg->offset))
+ return -1;
+ if (arg->offset >= 0) {
+ if (arg->offset - index_base < 0)
+ return update_err_splice_bound(op);
+ arg->offset -= index_base;
+ }
+ if (mp_read_i32(op, expr, &arg->cut_length) == 0)
+ return mp_read_str(op, expr, &arg->paste_length, &arg->paste);
+ return -1;
+}
+
+/* }}} read_arg */
+
+/* {{{ do_op helpers. */
+
+static inline double
+cast_arith_arg_to_double(struct op_arith_arg arg)
+{
+ if (arg.type == AT_DOUBLE) {
+ return arg.dbl;
+ } else if (arg.type == AT_FLOAT) {
+ return arg.flt;
+ } else {
+ assert(arg.type == AT_INT);
+ if (int96_is_uint64(&arg.int96)) {
+ return int96_extract_uint64(&arg.int96);
+ } else {
+ assert(int96_is_neg_int64(&arg.int96));
+ return int96_extract_neg_int64(&arg.int96);
+ }
+ }
+}
+
+static inline decimal_t *
+cast_arith_arg_to_decimal(struct op_arith_arg arg, decimal_t *dec)
+{
+ if (arg.type == AT_DECIMAL) {
+ *dec = arg.dec;
+ return dec;
+ } else if (arg.type == AT_DOUBLE) {
+ return decimal_from_double(dec, arg.dbl);
+ } else if (arg.type == AT_FLOAT) {
+ return decimal_from_double(dec, arg.flt);
+ } else {
+ assert(arg.type == AT_INT);
+ if (int96_is_uint64(&arg.int96)) {
+ uint64_t val = int96_extract_uint64(&arg.int96);
+ return decimal_from_uint64(dec, val);
+ } else {
+ assert(int96_is_neg_int64(&arg.int96));
+ int64_t val = int96_extract_neg_int64(&arg.int96);
+ return decimal_from_int64(dec, val);
+ }
+ }
+}
+
+uint32_t
+update_arith_sizeof(struct op_arith_arg *arg)
+{
+ switch (arg->type) {
+ case AT_INT:
+ if (int96_is_uint64(&arg->int96)) {
+ uint64_t val = int96_extract_uint64(&arg->int96);
+ return mp_sizeof_uint(val);
+ } else {
+ int64_t val = int96_extract_neg_int64(&arg->int96);
+ return mp_sizeof_int(val);
+ }
+ break;
+ case AT_DOUBLE:
+ return mp_sizeof_double(arg->dbl);
+ case AT_FLOAT:
+ return mp_sizeof_float(arg->flt);
+ default:
+ assert(arg->type == AT_DECIMAL);
+ return mp_sizeof_decimal(&arg->dec);
+ }
+}
+
+int
+make_arith_operation(struct update_op *op, struct op_arith_arg arg,
+ struct op_arith_arg *ret)
+{
+ struct op_arith_arg value = op->arg.arith;
+ enum arith_type lowest_type = arg.type;
+ if (arg.type > value.type)
+ lowest_type = value.type;
+
+ if (lowest_type == AT_INT) {
+ switch(op->opcode) {
+ case '+':
+ int96_add(&arg.int96, &value.int96);
+ break;
+ case '-':
+ int96_invert(&value.int96);
+ int96_add(&arg.int96, &value.int96);
+ break;
+ default:
+ unreachable();
+ break;
+ }
+ if (!int96_is_uint64(&arg.int96) &&
+ !int96_is_neg_int64(&arg.int96))
+ return update_err_int_overflow(op);
+ *ret = arg;
+ } else if (lowest_type >= AT_DOUBLE) {
+		/* At least one of the operands is a double or a float. */
+ double a = cast_arith_arg_to_double(arg);
+ double b = cast_arith_arg_to_double(value);
+ double c;
+ switch(op->opcode) {
+ case '+':
+ c = a + b;
+ break;
+ case '-':
+ c = a - b;
+ break;
+ default:
+ unreachable();
+ break;
+ }
+ if (lowest_type == AT_DOUBLE) {
+ ret->type = AT_DOUBLE;
+ ret->dbl = c;
+ } else {
+ assert(lowest_type == AT_FLOAT);
+ ret->type = AT_FLOAT;
+ ret->flt = (float)c;
+ }
+ } else {
+ decimal_t a, b, c;
+ if (! cast_arith_arg_to_decimal(arg, &a) ||
+ ! cast_arith_arg_to_decimal(value, &b)) {
+ return update_err_arg_type(op, "a number convertible "\
+ "to decimal");
+ }
+ switch(op->opcode) {
+ case '+':
+ if (decimal_add(&c, &a, &b) == NULL)
+ return update_err_decimal_overflow(op);
+ break;
+ case '-':
+ if (decimal_sub(&c, &a, &b) == NULL)
+ return update_err_decimal_overflow(op);
+ break;
+ default:
+ unreachable();
+ break;
+ }
+ ret->type = AT_DECIMAL;
+ ret->dec = c;
+ }
+ return 0;
+}
+
+int
+update_op_do_arith(struct update_op *op, const char *old)
+{
+ struct op_arith_arg left_arg;
+ if (mp_read_arith_arg(op, &old, &left_arg) != 0 ||
+ make_arith_operation(op, left_arg, &op->arg.arith) != 0)
+ return -1;
+ op->new_field_len = update_arith_sizeof(&op->arg.arith);
+ return 0;
+}
+
+int
+update_op_do_bit(struct update_op *op, const char *old)
+{
+ uint64_t val;
+ if (mp_read_uint(op, &old, &val) != 0)
+ return -1;
+ struct op_bit_arg *arg = &op->arg.bit;
+ switch (op->opcode) {
+ case '&':
+ arg->val &= val;
+ break;
+ case '^':
+ arg->val ^= val;
+ break;
+ case '|':
+ arg->val |= val;
+ break;
+ default:
+ /* Checked by update_read_ops. */
+ unreachable();
+ }
+ op->new_field_len = mp_sizeof_uint(arg->val);
+ return 0;
+}
+
+int
+update_op_do_splice(struct update_op *op, const char *old)
+{
+ struct op_splice_arg *arg = &op->arg.splice;
+ int32_t str_len;
+ if (mp_read_str(op, &old, (uint32_t *) &str_len, &old) != 0)
+ return -1;
+
+ if (arg->offset < 0) {
+ if (-arg->offset > str_len + 1)
+ return update_err_splice_bound(op);
+ arg->offset += str_len + 1;
+ } else if (arg->offset > str_len) {
+ arg->offset = str_len;
+ }
+ assert(arg->offset >= 0 && arg->offset <= str_len);
+ if (arg->cut_length < 0) {
+ if (-arg->cut_length > (str_len - arg->offset))
+ arg->cut_length = 0;
+ else
+ arg->cut_length += str_len - arg->offset;
+ } else if (arg->cut_length > str_len - arg->offset) {
+ arg->cut_length = str_len - arg->offset;
+ }
+ assert(arg->offset <= str_len);
+
+ arg->tail_offset = arg->offset + arg->cut_length;
+ arg->tail_length = str_len - arg->tail_offset;
+ op->new_field_len = mp_sizeof_str(arg->offset + arg->paste_length +
+ arg->tail_length);
+ return 0;
+}
+
+/* }}} do_op helpers. */
+
+/* {{{ store_op */
+
+static void
+store_op_set(struct update_op *op, const char *in, char *out)
+{
+ (void) in;
+ memcpy(out, op->arg.set.value, op->arg.set.length);
+}
+
+void
+store_op_arith(struct update_op *op, const char *in, char *out)
+{
+ (void) in;
+ struct op_arith_arg *arg = &op->arg.arith;
+ switch (arg->type) {
+ case AT_INT:
+ if (int96_is_uint64(&arg->int96)) {
+ mp_encode_uint(out, int96_extract_uint64(&arg->int96));
+ } else {
+ assert(int96_is_neg_int64(&arg->int96));
+ mp_encode_int(out, int96_extract_neg_int64(&arg->int96));
+ }
+ break;
+ case AT_DOUBLE:
+ mp_encode_double(out, arg->dbl);
+ break;
+ case AT_FLOAT:
+ mp_encode_float(out, arg->flt);
+ break;
+ default:
+ assert(arg->type == AT_DECIMAL);
+ mp_encode_decimal(out, &arg->dec);
+ break;
+ }
+}
+
+static void
+store_op_bit(struct update_op *op, const char *in, char *out)
+{
+ (void) in;
+ mp_encode_uint(out, op->arg.bit.val);
+}
+
+static void
+store_op_splice(struct update_op *op, const char *in, char *out)
+{
+ struct op_splice_arg *arg = &op->arg.splice;
+ uint32_t new_str_len = arg->offset + arg->paste_length +
+ arg->tail_length;
+ (void) mp_decode_strl(&in);
+ out = mp_encode_strl(out, new_str_len);
+ /* Copy field head. */
+ memcpy(out, in, arg->offset);
+ out = out + arg->offset;
+ /* Copy the paste. */
+ memcpy(out, arg->paste, arg->paste_length);
+ out = out + arg->paste_length;
+ /* Copy tail. */
+ memcpy(out, in + arg->tail_offset, arg->tail_length);
+}
+
+/* }}} store_op */
+
+static const struct update_op_meta op_set =
+ { read_arg_set, do_op_set, store_op_set, 3 };
+static const struct update_op_meta op_insert =
+ { read_arg_insert, do_op_insert, store_op_set, 3 };
+static const struct update_op_meta op_arith =
+ { read_arg_arith, do_op_arith, store_op_arith, 3 };
+static const struct update_op_meta op_bit =
+ { read_arg_bit, do_op_bit, store_op_bit, 3 };
+static const struct update_op_meta op_splice =
+ { read_arg_splice, do_op_splice, store_op_splice, 5 };
+static const struct update_op_meta op_delete =
+ { read_arg_delete, do_op_delete, NULL, 3 };
+
+static inline const struct update_op_meta *
+update_op_by(char opcode)
+{
+ switch (opcode) {
+ case '=':
+ return &op_set;
+ case '+':
+ case '-':
+ return &op_arith;
+ case '&':
+ case '|':
+ case '^':
+ return &op_bit;
+ case ':':
+ return &op_splice;
+ case '#':
+ return &op_delete;
+ case '!':
+ return &op_insert;
+ default:
+ diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
+ return NULL;
+ }
+}
+
+int
+update_op_decode(struct update_op *op, int index_base,
+ struct tuple_dictionary *dict, const char **expr)
+{
+ if (mp_typeof(**expr) != MP_ARRAY) {
+ diag_set(ClientError, ER_ILLEGAL_PARAMS, "update operation "
+ "must be an array {op,..}");
+ return -1;
+ }
+ uint32_t len, arg_count = mp_decode_array(expr);
+ if (arg_count < 1) {
+ diag_set(ClientError, ER_ILLEGAL_PARAMS, "update operation "\
+ "must be an array {op,..}, got empty array");
+ return -1;
+ }
+ if (mp_typeof(**expr) != MP_STR) {
+ diag_set(ClientError, ER_ILLEGAL_PARAMS,
+ "update operation name must be a string");
+ return -1;
+ }
+ op->opcode = *mp_decode_str(expr, &len);
+ op->meta = update_op_by(op->opcode);
+ if (op->meta == NULL)
+ return -1;
+ if (arg_count != op->meta->arg_count) {
+ diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
+ return -1;
+ }
+ int32_t field_no;
+ switch(mp_typeof(**expr)) {
+ case MP_INT:
+ case MP_UINT: {
+ if (mp_read_i32(op, expr, &field_no) != 0)
+ return -1;
+ if (field_no - index_base >= 0) {
+ op->field_no = field_no - index_base;
+ } else if (field_no < 0) {
+ op->field_no = field_no;
+ } else {
+ diag_set(ClientError, ER_NO_SUCH_FIELD_NO, field_no);
+ return -1;
+ }
+ break;
+ }
+ case MP_STR: {
+ const char *path = mp_decode_str(expr, &len);
+ uint32_t field_no, hash = field_name_hash(path, len);
+ if (tuple_fieldno_by_name(dict, path, len, hash,
+ &field_no) == 0) {
+ op->field_no = (int32_t) field_no;
+ break;
+ }
+ diag_set(ClientError, ER_NO_SUCH_FIELD_NAME,
+ tt_cstr(path, len));
+ return -1;
+ }
+ default:
+ diag_set(ClientError, ER_ILLEGAL_PARAMS,
+ "field id must be a number or a string");
+ return -1;
+ }
+ return op->meta->read_arg_cb(op, expr, index_base);
+}
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
new file mode 100644
index 000000000..87f1d0b74
--- /dev/null
+++ b/src/box/update/update_field.h
@@ -0,0 +1,410 @@
+#ifndef TARANTOOL_BOX_TUPLE_UPDATE_FIELD_H
+#define TARANTOOL_BOX_TUPLE_UPDATE_FIELD_H
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "trivia/util.h"
+#include "tt_static.h"
+#include <stdint.h>
+#include "bit/int96.h"
+#include "mp_decimal.h"
+
+/**
+ * This file is a link between all the update operations for all
+ * the field types. It functions like a router, when an update
+ * operation is being parsed step by step, going down the update
+ * tree. For example, when an update operation goes through an
+ * array, a map, another array, and ends with a scalar operation,
+ * at the end of each step the operation goes to the next one via
+ * functions of this file: update_array.c -> update_field.c ->
+ * update_map.c -> update_field.c -> update_array.c -> ... . The
+ * functions, doing the routing, are do_op_<operation_type>(), see
+ * them below.
+ */
+
+struct rope;
+struct update_field;
+struct update_op;
+struct tuple_dictionary;
+
+/* {{{ update_op */
+
+/** Argument of SET (and INSERT) operation. */
+struct op_set_arg {
+ uint32_t length;
+ const char *value;
+};
+
+/** Argument of DELETE operation. */
+struct op_del_arg {
+ uint32_t count;
+};
+
+/**
+ * Type of an arithmetic argument or result. MsgPack type codes
+ * are not reused here, to simplify result type calculation.
+ */
+enum arith_type {
+ AT_DECIMAL = 0, /* MP_EXT + MP_DECIMAL */
+ AT_DOUBLE = 1, /* MP_DOUBLE */
+ AT_FLOAT = 2, /* MP_FLOAT */
+ AT_INT = 3 /* MP_INT/MP_UINT */
+};
+
+/**
+ * Argument (left and right) and result of ADD, SUBTRACT.
+ *
+ * To perform an arithmetic operation, update first loads left
+ * and right arguments into corresponding value objects, then
+ * performs arithmetics on types of arguments, thus calculating
+ * the type of the result, and then performs the requested
+ * operation according to the calculated type rules.
+ *
+ * The rules are as follows:
+ * - when one of the argument types is double, the result is
+ * double;
+ * - when one of the argument types is float, the result is float;
+ * - for integer arguments, the result type code depends on the
+ *   range into which the result of the operation falls. If the
+ * result is in negative range, it's MP_INT, otherwise it's
+ * MP_UINT. If the result is out of bounds of (-2^63, 2^64), an
+ * exception is raised for overflow.
+ */
+struct op_arith_arg {
+ enum arith_type type;
+ union {
+ double dbl;
+ float flt;
+ struct int96_num int96;
+ decimal_t dec;
+ };
+};
+
+/** Argument of AND, XOR, OR operations. */
+struct op_bit_arg {
+ uint64_t val;
+};
+
+/** Argument of SPLICE. */
+struct op_splice_arg {
+ /** Splice position. */
+ int32_t offset;
+ /** Byte count to delete. */
+ int32_t cut_length;
+ /** New content. */
+ const char *paste;
+ /** New content length. */
+ uint32_t paste_length;
+
+ /** Offset of the tail in the old field. */
+ int32_t tail_offset;
+ /** Size of the tail. */
+ int32_t tail_length;
+};
+
+/** Update operation argument. */
+union update_op_arg {
+ struct op_set_arg set;
+ struct op_del_arg del;
+ struct op_arith_arg arith;
+ struct op_bit_arg bit;
+ struct op_splice_arg splice;
+};
+
+typedef int
+(*update_op_do_f)(struct update_op *op, struct update_field *field);
+
+typedef int
+(*update_op_read_arg_f)(struct update_op *op, const char **expr,
+ int index_base);
+
+typedef void
+(*update_op_store_f)(struct update_op *op, const char *in, char *out);
+
+/**
+ * A set of functions and properties to initialize, do and store
+ * an operation.
+ */
+struct update_op_meta {
+ /**
+ * Virtual function to read the arguments of the
+ * operation.
+ */
+ update_op_read_arg_f read_arg_cb;
+ /** Virtual function to execute the operation. */
+ update_op_do_f do_cb;
+ /**
+ * Virtual function to store a result of the operation.
+ */
+ update_op_store_f store_cb;
+ /** Argument count. */
+ uint32_t arg_count;
+};
+
+/** A single UPDATE operation. */
+struct update_op {
+ /** Operation meta depending on the op type. */
+ const struct update_op_meta *meta;
+ /** Operation arguments. */
+ union update_op_arg arg;
+ /** First level field no. */
+ int32_t field_no;
+ /** Size of a new field after it is updated. */
+ uint32_t new_field_len;
+ /** Opcode symbol: = + - / ... */
+ char opcode;
+};
+
+/**
+ * Decode an update operation from MessagePack.
+ * @param[out] op Update operation.
+ * @param index_base Field numbers base: 0 or 1.
+ * @param dict Dictionary to lookup field number by a name.
+ * @param expr MessagePack.
+ *
+ * @retval 0 Success.
+ * @retval -1 Client error.
+ */
+int
+update_op_decode(struct update_op *op, int index_base,
+ struct tuple_dictionary *dict, const char **expr);
+
+/* }}} update_op */
+
+/* {{{ update_field */
+
+/** Types of field update. */
+enum update_type {
+ /**
+ * Field is not updated. Just save it as is. It is used,
+ * for example, when a rope is split in two parts: not
+	 * for example, when a rope is split in two parts: an
+	 * unchanged left range of fields, and a right range whose
+	 * first field is changed. In this case the left range is
+ */
+ UPDATE_NOP,
+ /**
+ * Field is a scalar value, updated via set, arith, bit,
+ * splice, or any other scalar-applicable operation.
+ */
+ UPDATE_SCALAR,
+ /**
+ * Field is an updated array. Check the rope for updates
+ * of individual fields.
+ */
+ UPDATE_ARRAY,
+};
+
+/**
+ * Generic structure describing an update of a field. It can be
+ * a tuple field, a field of a tuple field, or any other tuple
+ * internal object: map, array, scalar, or unchanged field of any
+ * type without op. This is a node of the whole update tree.
+ *
+ * If the field is array or map, it contains more children fields
+ * inside corresponding sub-structures.
+ */
+struct update_field {
+ /**
+ * Type of this field's update. Union below depends on it.
+ */
+ enum update_type type;
+ /** Field data to update. */
+ const char *data;
+ uint32_t size;
+ union {
+ /**
+ * This update is terminal. It does a scalar
+		 * operation and has no child fields.
+ */
+ struct {
+ struct update_op *op;
+ } scalar;
+ /**
+		 * This update changes an array. Child fields
+ * are stored in rope nodes.
+ */
+ struct {
+ struct rope *rope;
+ } array;
+ };
+};
+
+/**
+ * Size of the updated field, including all children recursively.
+ */
+uint32_t
+update_field_sizeof(struct update_field *field);
+
+/** Save the updated field, including all children recursively. */
+uint32_t
+update_field_store(struct update_field *field, char *out, char *out_end);
+
+/**
+ * Generate declarations for a concrete field type: array, bar
+ * etc. Each complex type has basic operations of the same
+ * signature: insert, set, delete, arith, bit, splice.
+ *
+ * These operations can call each other going down a JSON path.
+ */
+#define OP_DECL_GENERIC(type) \
+int \
+do_op_##type##_insert(struct update_op *op, struct update_field *field); \
+ \
+int \
+do_op_##type##_set(struct update_op *op, struct update_field *field); \
+ \
+int \
+do_op_##type##_delete(struct update_op *op, struct update_field *field); \
+ \
+int \
+do_op_##type##_arith(struct update_op *op, struct update_field *field); \
+ \
+int \
+do_op_##type##_bit(struct update_op *op, struct update_field *field); \
+ \
+int \
+do_op_##type##_splice(struct update_op *op, struct update_field *field); \
+ \
+uint32_t \
+update_##type##_sizeof(struct update_field *field); \
+ \
+uint32_t \
+update_##type##_store(struct update_field *field, char *out, char *out_end);
+
+/* }}} update_field */
+
+/* {{{ update_field.array */
+
+/**
+ * Initialize @a field as an array to update.
+ * @param[out] field Field to initialize.
+ * @param data MessagePack data of the array to update.
+ * @param data_end End of @a data.
+ * @param field_count Field count in @data.
+ *
+ * @retval 0 Success.
+ * @retval -1 Error.
+ */
+int
+update_array_create(struct update_field *field, const char *data,
+ const char *data_end, uint32_t field_count);
+
+OP_DECL_GENERIC(array)
+
+/* }}} update_field.array */
+
+#undef OP_DECL_GENERIC
+
+/* {{{ Common helpers. */
+
+/**
+ * These helper functions are used when a caller doesn't know the
+ * type of a child node to propagate an update down the tree.
+ * For example, do_op_array_set calls do_op_set on its children.
+ * Each child can be another array, a bar, a route, a map -
+ * anything. The functions below help to make such places shorter
+ * and simpler.
+ */
+#define OP_DECL_GENERIC(op_type) \
+static inline int \
+do_op_##op_type(struct update_op *op, struct update_field *field) \
+{ \
+ switch (field->type) { \
+ case UPDATE_ARRAY: \
+ return do_op_array_##op_type(op, field); \
+ default: \
+ unreachable(); \
+ } \
+ return 0; \
+}
+
+OP_DECL_GENERIC(insert)
+
+OP_DECL_GENERIC(set)
+
+OP_DECL_GENERIC(delete)
+
+OP_DECL_GENERIC(arith)
+
+OP_DECL_GENERIC(bit)
+
+OP_DECL_GENERIC(splice)
+
+#undef OP_DECL_GENERIC
+
+/* }}} Common helpers. */
+
+/* {{{ Scalar helpers. */
+
+int
+make_arith_operation(struct update_op *op, struct op_arith_arg arg,
+ struct op_arith_arg *ret);
+
+void
+store_op_arith(struct update_op *op, const char *in, char *out);
+
+uint32_t
+update_arith_sizeof(struct op_arith_arg *arg);
+
+int
+update_op_do_arith(struct update_op *op, const char *old);
+
+int
+update_op_do_bit(struct update_op *op, const char *old);
+
+int
+update_op_do_splice(struct update_op *op, const char *old);
+
+/* }}} Scalar helpers. */
+
+/* {{{ Error helpers. */
+/**
+ * All the error helpers below set diag with an appropriate error
+ * code, taking into account field_no < 0 and complex paths. They
+ * all return -1 so that error returning in a caller function fits
+ * into a single line.
+ */
+
+int
+update_err_no_such_field(const struct update_op *op);
+
+int
+update_err(const struct update_op *op, const char *reason);
+
+static inline int
+update_err_double(const struct update_op *op)
+{
+ return update_err(op, "double update of the same field");
+}
+
+/* }}} Error helpers. */
+
+#endif /* TARANTOOL_BOX_TUPLE_UPDATE_FIELD_H */
diff --git a/test/box/misc.result b/test/box/misc.result
index c46c5a9d6..125d9139d 100644
--- a/test/box/misc.result
+++ b/test/box/misc.result
@@ -356,7 +356,7 @@ t;
22: box.error.TUPLE_NOT_ARRAY
23: box.error.FIELD_TYPE
24: box.error.INDEX_PART_TYPE_MISMATCH
- 25: box.error.SPLICE
+ 25: box.error.UPDATE_SPLICE
26: box.error.UPDATE_ARG_TYPE
27: box.error.FORMAT_MISMATCH_INDEX_PART
28: box.error.UNKNOWN_UPDATE_OP
diff --git a/test/box/tuple.result b/test/box/tuple.result
index 895462518..c748c886a 100644
--- a/test/box/tuple.result
+++ b/test/box/tuple.result
@@ -1371,13 +1371,13 @@ d:update{{'-', 1, ffi.new('float', 635)}}
d:update{{'+', 1, 0/0}}
---
- error: 'Argument type in operation ''+'' on field 1 does not match field type: expected
- a number convertible to decimal.'
+ a number convertible to decimal'
...
-- inf
d:update{{'-', 1, 1/0}}
---
- error: 'Argument type in operation ''-'' on field 1 does not match field type: expected
- a number convertible to decimal.'
+ a number convertible to decimal'
...
-- decimal overflow
d = box.tuple.new(dec.new('9e37'))
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [tarantool-patches] [PATCH v2 3/8] json: lexer_eof and token_cmp helper functions
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 1/8] tuple: expose JSON go_to_key and go_to_index functions Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 2/8] tuple: rework updates to improve code extendibility Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
[not found] ` <20190903192433.GF15611@atlas>
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 4/8] tuple: account the whole array in field.data and size Vladislav Shpilevoy
` (4 subsequent siblings)
7 siblings, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
They are needed by the incoming JSON updates, which have to
compare two JSON paths, parse them simultaneously, and dig into
a tuple.
json_token_cmp() existed before this patch, but it also compared
parent pointers, which is not needed in the JSON updates, since
they won't use JSON trees.
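For illustration only (not part of the patch), here is a minimal
sketch of how these helpers could be combined to compare two paths
token by token. The function name paths_cmp_sketch and the index
base 1 are made up for the example, error handling is simplified,
and json/json.h is assumed to be included:

static int
paths_cmp_sketch(const char *a, int a_len, const char *b, int b_len)
{
	struct json_lexer lexer_a, lexer_b;
	json_lexer_create(&lexer_a, a, a_len, 1);
	json_lexer_create(&lexer_b, b, b_len, 1);
	struct json_token token_a, token_b;
	while (!json_lexer_is_eof(&lexer_a) &&
	       !json_lexer_is_eof(&lexer_b)) {
		if (json_lexer_next_token(&lexer_a, &token_a) != 0 ||
		    json_lexer_next_token(&lexer_b, &token_b) != 0)
			return -1;
		int rc = json_token_cmp(&token_a, &token_b);
		if (rc != 0)
			return rc;
	}
	/* A path that is a prefix of the other compares as 'less'. */
	return (int) json_lexer_is_eof(&lexer_b) -
	       (int) json_lexer_is_eof(&lexer_a);
}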
Needed for #1261
---
src/lib/json/json.c | 37 +++++++++++++------------------------
src/lib/json/json.h | 28 ++++++++++++++++++++++++++++
2 files changed, 41 insertions(+), 24 deletions(-)
diff --git a/src/lib/json/json.c b/src/lib/json/json.c
index 1bfef172a..416c7dfda 100644
--- a/src/lib/json/json.c
+++ b/src/lib/json/json.c
@@ -55,7 +55,7 @@
static inline int
json_read_symbol(struct json_lexer *lexer, UChar32 *out)
{
- if (lexer->offset == lexer->src_len) {
+ if (json_lexer_is_eof(lexer)) {
*out = U_SENTINEL;
return lexer->symbol_count + 1;
}
@@ -211,7 +211,7 @@ json_parse_identifier(struct json_lexer *lexer, struct json_token *token)
int
json_lexer_next_token(struct json_lexer *lexer, struct json_token *token)
{
- if (lexer->offset == lexer->src_len) {
+ if (json_lexer_is_eof(lexer)) {
token->type = JSON_TOKEN_END;
return 0;
}
@@ -223,7 +223,7 @@ json_lexer_next_token(struct json_lexer *lexer, struct json_token *token)
switch(c) {
case (UChar32)'[':
/* Error for '[\0'. */
- if (lexer->offset == lexer->src_len)
+ if (json_lexer_is_eof(lexer))
return lexer->symbol_count;
c = json_current_char(lexer);
if (c == '"' || c == '\'') {
@@ -240,14 +240,14 @@ json_lexer_next_token(struct json_lexer *lexer, struct json_token *token)
* Expression, started from [ must be finished
* with ] regardless of its type.
*/
- if (lexer->offset == lexer->src_len ||
+ if (json_lexer_is_eof(lexer) ||
json_current_char(lexer) != ']')
return lexer->symbol_count + 1;
/* Skip ] - one byte char. */
json_skip_char(lexer);
return 0;
case (UChar32)'.':
- if (lexer->offset == lexer->src_len)
+ if (json_lexer_is_eof(lexer))
return lexer->symbol_count + 1;
return json_parse_identifier(lexer, token);
default:
@@ -259,26 +259,15 @@ json_lexer_next_token(struct json_lexer *lexer, struct json_token *token)
}
/**
- * Compare JSON token keys.
+ * Compare JSON tokens as nodes of a JSON tree. That is, including
+ * parent references.
*/
static int
-json_token_cmp(const struct json_token *a, const struct json_token *b)
+json_token_cmp_in_tree(const struct json_token *a, const struct json_token *b)
{
if (a->parent != b->parent)
return a->parent - b->parent;
- if (a->type != b->type)
- return a->type - b->type;
- int ret = 0;
- if (a->type == JSON_TOKEN_STR) {
- if (a->len != b->len)
- return a->len - b->len;
- ret = memcmp(a->str, b->str, a->len);
- } else if (a->type == JSON_TOKEN_NUM) {
- ret = a->num - b->num;
- } else {
- assert(a->type == JSON_TOKEN_ANY);
- }
- return ret;
+ return json_token_cmp(a, b);
}
int
@@ -289,7 +278,7 @@ json_path_cmp(const char *a, int a_len, const char *b, int b_len,
json_lexer_create(&lexer_a, a, a_len, index_base);
json_lexer_create(&lexer_b, b, b_len, index_base);
struct json_token token_a, token_b;
- /* For the sake of json_token_cmp(). */
+ /* For the sake of json_token_cmp_in_tree(). */
token_a.parent = NULL;
token_b.parent = NULL;
int rc_a, rc_b;
@@ -297,7 +286,7 @@ json_path_cmp(const char *a, int a_len, const char *b, int b_len,
(rc_b = json_lexer_next_token(&lexer_b, &token_b)) == 0 &&
token_a.type != JSON_TOKEN_END &&
token_b.type != JSON_TOKEN_END) {
- int rc = json_token_cmp(&token_a, &token_b);
+ int rc = json_token_cmp_in_tree(&token_a, &token_b);
if (rc != 0)
return rc;
}
@@ -423,8 +412,8 @@ json_tree_snprint_path(char *buf, int size, const struct json_token *token,
#define mh_arg_t void *
#define mh_hash(a, arg) ((*(a))->hash)
#define mh_hash_key(a, arg) ((a)->hash)
-#define mh_cmp(a, b, arg) (json_token_cmp(*(a), *(b)))
-#define mh_cmp_key(a, b, arg) (json_token_cmp((a), *(b)))
+#define mh_cmp(a, b, arg) (json_token_cmp_in_tree(*(a), *(b)))
+#define mh_cmp_key(a, b, arg) (json_token_cmp_in_tree((a), *(b)))
#include "salad/mhash.h"
static const uint32_t hash_seed = 13U;
diff --git a/src/lib/json/json.h b/src/lib/json/json.h
index d66a9c7a4..08a0ee96c 100644
--- a/src/lib/json/json.h
+++ b/src/lib/json/json.h
@@ -241,6 +241,13 @@ json_lexer_create(struct json_lexer *lexer, const char *src, int src_len,
int
json_lexer_next_token(struct json_lexer *lexer, struct json_token *token);
+/** Check if @a lexer has finished parsing. */
+static inline bool
+json_lexer_is_eof(const struct json_lexer *lexer)
+{
+ return lexer->offset == lexer->src_len;
+}
+
/**
* Compare two JSON paths using Lexer class.
* - in case of paths that have same token-sequence prefix,
@@ -279,6 +286,27 @@ json_token_is_leaf(struct json_token *token)
return token->max_child_idx < 0;
}
+/**
+ * Compare two JSON tokens, not taking into account their tree
+ * attributes.
+ */
+static inline int
+json_token_cmp(const struct json_token *l, const struct json_token *r)
+{
+ if (l->type != r->type)
+ return l->type - r->type;
+ switch(l->type) {
+ case JSON_TOKEN_NUM:
+ return l->num - r->num;
+ case JSON_TOKEN_STR:
+ if (l->len != r->len)
+ return l->len - r->len;
+ return memcmp(l->str, r->str, l->len);
+ default:
+ return 0;
+ }
+}
+
/**
* Test if a given JSON token is multikey.
*/
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [tarantool-patches] [PATCH v2 4/8] tuple: account the whole array in field.data and size
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
` (2 preceding siblings ...)
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 3/8] json: lexer_eof and token_cmp helper functions Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 5/8] tuple: enable JSON bar updates Vladislav Shpilevoy
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
Before the patch a struct update_field object describing an array
didn't account for the array header in its size and data. That was
not needed, because updates could only be 'flat', and the header
was regenerated anyway.
But after the next patches an array update may not be on the top
level of an update tree, and with the current way of accounting
its parent node would not be able to see the exact borders of each
child.
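A hypothetical snippet (not from the patch) showing the new
accounting. The function name root_array_sketch is made up, the
usual headers are assumed, and it has to run inside a fiber because
update_array_create() allocates its rope on the fiber region:

static int
root_array_sketch(void)
{
	char buf[8];
	char *end = mp_encode_array(buf, 2);	/* 0x92 - array(2) header. */
	end = mp_encode_uint(end, 10);		/* 0x0a */
	end = mp_encode_uint(end, 20);		/* 0x14 */
	struct update_field root;
	/* header = buf, data = buf + 1, data_end = end, 2 fields. */
	if (update_array_create(&root, buf, buf + 1, end, 2) != 0)
		return -1;
	/* Before the patch: root.data == buf + 1, root.size == 2. */
	assert(root.data == buf);
	assert(root.size == (uint32_t) (end - buf));
	return 0;
}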
Part of #1261
---
src/box/tuple_update.c | 28 ++++++++++++++++------------
src/box/update/update_array.c | 9 +++++----
src/box/update/update_field.h | 6 ++++--
3 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/src/box/tuple_update.c b/src/box/tuple_update.c
index 87946919d..81e1f7e97 100644
--- a/src/box/tuple_update.c
+++ b/src/box/tuple_update.c
@@ -193,11 +193,12 @@ update_read_ops(struct tuple_update *update, const char *expr,
* @retval -1 Error.
*/
static int
-update_do_ops(struct tuple_update *update, const char *old_data,
- const char *old_data_end, uint32_t part_count)
+update_do_ops(struct tuple_update *update, const char *header,
+ const char *old_data, const char *old_data_end,
+ uint32_t part_count)
{
- if (update_array_create(&update->root_array, old_data, old_data_end,
- part_count) != 0)
+ if (update_array_create(&update->root_array, header, old_data,
+ old_data_end, part_count) != 0)
return -1;
struct update_op *op = update->ops;
struct update_op *ops_end = op + update->op_count;
@@ -214,12 +215,12 @@ update_do_ops(struct tuple_update *update, const char *old_data,
* and it is enough to simply write the error to the log.
*/
static int
-upsert_do_ops(struct tuple_update *update, const char *old_data,
- const char *old_data_end, uint32_t part_count,
- bool suppress_error)
+upsert_do_ops(struct tuple_update *update, const char *header,
+ const char *old_data, const char *old_data_end,
+ uint32_t part_count, bool suppress_error)
{
- if (update_array_create(&update->root_array, old_data, old_data_end,
- part_count) != 0)
+ if (update_array_create(&update->root_array, header, old_data,
+ old_data_end, part_count) != 0)
return -1;
struct update_op *op = update->ops;
struct update_op *ops_end = op + update->op_count;
@@ -276,11 +277,13 @@ tuple_update_execute(const char *expr,const char *expr_end,
{
struct tuple_update update;
update_init(&update, index_base);
+ const char *header = old_data;
uint32_t field_count = mp_decode_array(&old_data);
if (update_read_ops(&update, expr, expr_end, dict, field_count) != 0)
return NULL;
- if (update_do_ops(&update, old_data, old_data_end, field_count))
+ if (update_do_ops(&update, header, old_data, old_data_end,
+ field_count) != 0)
return NULL;
if (column_mask)
*column_mask = update.column_mask;
@@ -296,12 +299,13 @@ tuple_upsert_execute(const char *expr,const char *expr_end,
{
struct tuple_update update;
update_init(&update, index_base);
+ const char *header = old_data;
uint32_t field_count = mp_decode_array(&old_data);
if (update_read_ops(&update, expr, expr_end, dict, field_count) != 0)
return NULL;
- if (upsert_do_ops(&update, old_data, old_data_end, field_count,
- suppress_error))
+ if (upsert_do_ops(&update, header, old_data, old_data_end, field_count,
+ suppress_error) != 0)
return NULL;
if (column_mask)
*column_mask = update.column_mask;
diff --git a/src/box/update/update_array.c b/src/box/update/update_array.c
index 985d555af..5b834b644 100644
--- a/src/box/update/update_array.c
+++ b/src/box/update/update_array.c
@@ -143,12 +143,13 @@ update_array_extract_item(struct update_field *field, struct update_op *op)
}
int
-update_array_create(struct update_field *field, const char *data,
- const char *data_end, uint32_t field_count)
+update_array_create(struct update_field *field, const char *header,
+ const char *data, const char *data_end,
+ uint32_t field_count)
{
field->type = UPDATE_ARRAY;
- field->data = data;
- field->size = data_end - data;
+ field->data = header;
+ field->size = data_end - header;
field->array.rope = rope_new(&fiber()->gc);
if (field->array.rope == NULL)
return -1;
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
index 87f1d0b74..8ce3c3e82 100644
--- a/src/box/update/update_field.h
+++ b/src/box/update/update_field.h
@@ -306,6 +306,7 @@ update_##type##_store(struct update_field *field, char *out, char *out_end);
/**
* Initialize @a field as an array to update.
* @param[out] field Field to initialize.
+ * @param header Header of the MessagePack array @a data.
* @param data MessagePack data of the array to update.
* @param data_end End of @a data.
* @param field_count Field count in @data.
@@ -314,8 +315,9 @@ update_##type##_store(struct update_field *field, char *out, char *out_end);
* @retval -1 Error.
*/
int
-update_array_create(struct update_field *field, const char *data,
- const char *data_end, uint32_t field_count);
+update_array_create(struct update_field *field, const char *header,
+ const char *data, const char *data_end,
+ uint32_t field_count);
OP_DECL_GENERIC(array)
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [tarantool-patches] [PATCH v2 5/8] tuple: enable JSON bar updates
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
` (3 preceding siblings ...)
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 4/8] tuple: account the whole array in field.data and size Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 6/8] tuple: make update operation tokens consumable Vladislav Shpilevoy
` (2 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
Bar update is an update by JSON path of a tuple's internal object
with no other updates along the path. It is an optimization that
avoids storing each part of the path and a link to it in memory.
A bar stores just the JSON path string and a pointer to the
MessagePack object to update.
So far a bar can not have other updates along the path: this
patch introduces only non-intersected JSON path updates (which is
actually one of the most common cases, so bar is a really useful
optimization).
An example. There is an update {'=', 'a.b.c[1][2][3]', 100}.
For this whole update only one object will be created:
{
op = '=',
path = 'a.b.c[1][2][3]',
new_value = 100,
ptr_to_tuple_msgpack = 0x....,
...
}
The elements 'a', 'b', 'c', '[1]', '[2]', '[3]' are not a list
of objects each occupying memory. The whole bar is stored as one
object, and the path as a pointer into the MessagePack string
obtained from a user.
This makes the memory complexity of JSON updates independent of
path length.
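A rough sketch of the idea (a hypothetical helper, simplified from
update_bar_locate() in the diff below, reusing the lexer and the
tuple_field_go_to_index()/tuple_field_go_to_key() helpers from the
earlier patches): the path is consumed straight from the user's
string, so nothing is allocated per token no matter how long the
path is:

static int
dig_sketch(const char *path, int path_len, const char *field)
{
	struct json_lexer lexer;
	json_lexer_create(&lexer, path, path_len, TUPLE_INDEX_BASE);
	struct json_token token;
	int rc;
	while ((rc = json_lexer_next_token(&lexer, &token)) == 0 &&
	       token.type != JSON_TOKEN_END) {
		if (token.type == JSON_TOKEN_NUM)
			rc = tuple_field_go_to_index(&field, token.num);
		else if (token.type == JSON_TOKEN_STR)
			rc = tuple_field_go_to_key(&field, token.str,
						   token.len);
		else
			return -1;
		if (rc != 0)
			return -1;
	}
	/* On success 'field' points at the MessagePack value to update. */
	return rc == 0 ? 0 : -1;
}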
Part of #1261
---
src/box/CMakeLists.txt | 1 +
src/box/tuple_update.c | 35 ++-
src/box/update/update_array.c | 24 +-
src/box/update/update_bar.c | 408 +++++++++++++++++++++++++++++++++
src/box/update/update_field.c | 44 +++-
src/box/update/update_field.h | 102 +++++++++
src/box/vinyl.c | 17 +-
test/box/update.result | 410 +++++++++++++++++++++++++++++++++-
test/box/update.test.lua | 145 ++++++++++++
test/engine/update.result | 5 -
test/engine/update.test.lua | 2 -
test/unit/column_mask.c | 75 ++++++-
test/unit/column_mask.result | 8 +-
13 files changed, 1247 insertions(+), 29 deletions(-)
create mode 100644 src/box/update/update_bar.c
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 62737aafe..2888f4d7d 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -43,6 +43,7 @@ add_library(tuple STATIC
tuple_update.c
update/update_field.c
update/update_array.c
+ update/update_bar.c
tuple_compare.cc
tuple_extract_key.cc
tuple_hash.cc
diff --git a/src/box/tuple_update.c b/src/box/tuple_update.c
index 81e1f7e97..a9e4ed615 100644
--- a/src/box/tuple_update.c
+++ b/src/box/tuple_update.c
@@ -47,7 +47,12 @@ struct tuple_update {
* is from Lua, then the base is 1. Otherwise 0.
*/
int index_base;
- /** A bitmask of all columns modified by this update. */
+ /**
+ * A bitmask of all columns modified by this update. Only
+ * the first level of a tuple is accounted here. I.e. if
+ * a field [1][2][3] was updated, then only [1] is
+ * reflected.
+ */
uint64_t column_mask;
/** First level of update tree. It is always array. */
struct update_field root_array;
@@ -106,9 +111,25 @@ update_read_ops(struct tuple_update *update, const char *expr,
*/
if (column_mask != COLUMN_MASK_FULL) {
int32_t field_no;
+ char opcode;
+ if (update_op_is_term(op)) {
+ opcode = op->opcode;
+ } else {
+ /*
+				 * When an operation is not
+				 * terminal, on the first level it
+				 * for sure changes only one field
+				 * and in terms of column mask is
+				 * equivalent to any scalar
+				 * operation, even if it was '!'
+				 * or '#'.
+ */
+ opcode = 0;
+ }
+
if (op->field_no >= 0)
field_no = op->field_no;
- else if (op->opcode != '!')
+ else if (opcode != '!')
field_no = field_count_hint + op->field_no;
else
/*
@@ -151,12 +172,12 @@ update_read_ops(struct tuple_update *update, const char *expr,
* hint. It is used to translate negative
* field numbers into positive ones.
*/
- if (op->opcode == '!')
+ if (opcode == '!')
++field_count_hint;
- else if (op->opcode == '#')
+ else if (opcode == '#')
field_count_hint -= (int32_t) op->arg.del.count;
- if (op->opcode == '!' || op->opcode == '#')
+ if (opcode == '!' || opcode == '#')
/*
* If the operation is insertion
* or deletion then it potentially
@@ -331,8 +352,8 @@ tuple_upsert_squash(const char *expr1, const char *expr1_end,
int32_t prev_field_no = index_base - 1;
for (uint32_t i = 0; i < update[j].op_count; i++) {
struct update_op *op = &update[j].ops[i];
- if (op->opcode != '+' && op->opcode != '-' &&
- op->opcode != '=')
+ if ((op->opcode != '+' && op->opcode != '-' &&
+ op->opcode != '=') || op->lexer.src != NULL)
return NULL;
if (op->field_no <= prev_field_no)
return NULL;
diff --git a/src/box/update/update_array.c b/src/box/update/update_array.c
index 5b834b644..fe50a605a 100644
--- a/src/box/update/update_array.c
+++ b/src/box/update/update_array.c
@@ -216,12 +216,19 @@ do_op_array_insert(struct update_op *op, struct update_field *field)
{
assert(field->type == UPDATE_ARRAY);
struct rope *rope = field->array.rope;
+ struct update_array_item *item;
+ if (! update_op_is_term(op)) {
+ item = update_array_extract_item(field, op);
+ if (item == NULL)
+ return -1;
+ return do_op_insert(op, &item->field);
+ }
+
if (update_op_adjust_field_no(op, rope_size(rope) + 1) != 0)
return -1;
- struct update_array_item *item =
- (struct update_array_item *) rope_alloc(rope->ctx,
- sizeof(*item));
+ item = (struct update_array_item *) rope_alloc(rope->ctx,
+ sizeof(*item));
if (item == NULL)
return -1;
update_array_item_create(item, UPDATE_NOP, op->arg.set.value,
@@ -242,6 +249,8 @@ do_op_array_set(struct update_op *op, struct update_field *field)
update_array_extract_item(field, op);
if (item == NULL)
return -1;
+ if (! update_op_is_term(op))
+ return do_op_set(op, &item->field);
op->new_field_len = op->arg.set.length;
/* Ignore the previous op, if any. */
item->field.type = UPDATE_SCALAR;
@@ -253,6 +262,13 @@ int
do_op_array_delete(struct update_op *op, struct update_field *field)
{
assert(field->type == UPDATE_ARRAY);
+ if (! update_op_is_term(op)) {
+ struct update_array_item *item =
+ update_array_extract_item(field, op);
+ if (item == NULL)
+ return -1;
+ return do_op_delete(op, &item->field);
+ }
struct rope *rope = field->array.rope;
uint32_t size = rope_size(rope);
if (update_op_adjust_field_no(op, size) != 0)
@@ -273,6 +289,8 @@ do_op_array_##op_type(struct update_op *op, struct update_field *field) \
update_array_extract_item(field, op); \
if (item == NULL) \
return -1; \
+ if (! update_op_is_term(op)) \
+ return do_op_##op_type(op, &item->field); \
if (item->field.type != UPDATE_NOP) \
return update_err_double(op); \
if (update_op_do_##op_type(op, item->field.data) != 0) \
diff --git a/src/box/update/update_bar.c b/src/box/update/update_bar.c
new file mode 100644
index 000000000..f4fc00716
--- /dev/null
+++ b/src/box/update/update_bar.c
@@ -0,0 +1,408 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "update_field.h"
+#include "box/tuple.h"
+
+/**
+ * Locate the field to update by JSON path in @a op->path. If
+ * found, initialize @a field as a bar update.
+ * @param op Update operation.
+ * @param field Field to locate in.
+ *
+ * @retval 0 Success.
+ * @retval -1 Not found or invalid JSON.
+ */
+static inline int
+update_bar_locate(struct update_op *op, struct update_field *field)
+{
+ assert(! update_op_is_term(op));
+ const char *parent = NULL, *pos = field->data;
+ field->bar.path = op->lexer.src + op->lexer.offset;
+ field->bar.path_len = op->lexer.src_len - op->lexer.offset;
+ int rc;
+ struct json_token token;
+ while ((rc = json_lexer_next_token(&op->lexer, &token)) == 0 &&
+ token.type != JSON_TOKEN_END) {
+
+ parent = pos;
+ switch (token.type) {
+ case JSON_TOKEN_NUM:
+ rc = tuple_field_go_to_index(&pos, token.num);
+ break;
+ case JSON_TOKEN_STR:
+ rc = tuple_field_go_to_key(&pos, token.str, token.len);
+ break;
+ default:
+ assert(token.type == JSON_TOKEN_ANY);
+ return update_err_bad_json(op,
+ op->lexer.symbol_count - 1);
+ }
+ if (rc != 0)
+ return update_err_no_such_field(op);
+ }
+ if (rc > 0)
+ return update_err_bad_json(op, rc);
+
+ field->type = UPDATE_BAR;
+ field->bar.point = pos;
+ mp_next(&pos);
+ field->bar.point_size = pos - field->bar.point;
+ field->bar.op = op;
+ field->bar.parent = parent;
+ return 0;
+}
+
+/**
+ * Locate the optional field to set by JSON path in @a op->path.
+ * If found or only a last path part is not found, initialize @a
+ * field.
+ * @param op Update operation.
+ * @param field Field to locate in.
+ * @param[out] is_found Set if the field was found.
+ * @param[out] key_len_or_index One parameter for two values,
+ * depending on where the target point is located: in an
+ * array or a map. In case of map it is size of a key
+ * before the found point. It is used to find range of the
+ * both key and value in '#' operation to drop the pair.
+ * In case of array it is index of the point to be able to
+ * check how many fields are left for deletion.
+ *
+ * @retval 0 Success.
+ * @retval -1 Not found non-last path part or invalid JSON.
+ */
+static inline int
+update_bar_locate_opt(struct update_op *op, struct update_field *field,
+ bool *is_found, int *key_len_or_index)
+{
+ assert(! update_op_is_term(op));
+ int rc;
+ field->type = UPDATE_BAR;
+ field->bar.op = op;
+ field->bar.path = op->lexer.src + op->lexer.offset;
+ field->bar.path_len = op->lexer.src_len - op->lexer.offset;
+ const char *pos = field->data;
+ struct json_token token;
+ do {
+ rc = json_lexer_next_token(&op->lexer, &token);
+ if (rc != 0)
+ return update_err_bad_json(op, rc);
+
+ switch (token.type) {
+ case JSON_TOKEN_END:
+ *is_found = true;
+ field->bar.point = pos;
+ mp_next(&pos);
+ field->bar.point_size = pos - field->bar.point;
+ return 0;
+ case JSON_TOKEN_NUM:
+ field->bar.parent = pos;
+ *key_len_or_index = token.num;
+ rc = tuple_field_go_to_index(&pos, token.num);
+ break;
+ case JSON_TOKEN_STR:
+ field->bar.parent = pos;
+ *key_len_or_index = token.len;
+ rc = tuple_field_go_to_key(&pos, token.str, token.len);
+ break;
+ default:
+ assert(token.type == JSON_TOKEN_ANY);
+ return update_err_bad_json(op,
+ op->lexer.symbol_count - 1);
+ }
+ } while (rc == 0);
+ assert(rc == -1);
+ struct json_token tmp_token;
+ rc = json_lexer_next_token(&op->lexer, &tmp_token);
+ if (rc != 0)
+ return update_err_bad_json(op, rc);
+ if (tmp_token.type != JSON_TOKEN_END)
+ return update_err_no_such_field(op);
+
+ *is_found = false;
+ if (token.type == JSON_TOKEN_NUM) {
+ if (mp_typeof(*field->bar.parent) != MP_ARRAY) {
+ return update_err(op, "can not access by index a "\
+ "non-array field");
+ }
+ const char *tmp = field->bar.parent;
+ uint32_t size = mp_decode_array(&tmp);
+ if ((uint32_t) token.num > size)
+ return update_err_no_such_field(op);
+ /*
+ * The only way not to find an element in an array
+ * by an index is to use array size as the index.
+ */
+ assert((uint32_t) token.num == size);
+ if (field->bar.parent == field->data) {
+ field->bar.point = field->data + field->size;
+ } else {
+ field->bar.point = field->bar.parent;
+ mp_next(&field->bar.point);
+ }
+ } else {
+ assert(token.type == JSON_TOKEN_STR);
+ field->bar.new_key = token.str;
+ field->bar.new_key_len = token.len;
+ if (mp_typeof(*field->bar.parent) != MP_MAP) {
+ return update_err(op, "can not access by key a "\
+ "non-map field");
+ }
+ }
+ return 0;
+}
+
+int
+do_op_nop_insert(struct update_op *op, struct update_field *field)
+{
+ assert(op->opcode == '!');
+ assert(field->type == UPDATE_NOP);
+ bool is_found = false;
+ int key_len = 0;
+ if (update_bar_locate_opt(op, field, &is_found, &key_len) != 0)
+ return -1;
+ op->new_field_len = op->arg.set.length;
+ if (mp_typeof(*field->bar.parent) == MP_MAP) {
+ if (is_found)
+ return update_err_duplicate(op);
+ op->new_field_len += mp_sizeof_str(key_len);
+ }
+ return 0;
+}
+
+int
+do_op_nop_set(struct update_op *op, struct update_field *field)
+{
+ assert(op->opcode == '=');
+ assert(field->type == UPDATE_NOP);
+ bool is_found = false;
+ int key_len = 0;
+ if (update_bar_locate_opt(op, field, &is_found, &key_len) != 0)
+ return -1;
+ op->new_field_len = op->arg.set.length;
+ if (! is_found) {
+ op->opcode = '!';
+ if (mp_typeof(*field->bar.parent) == MP_MAP)
+ op->new_field_len += mp_sizeof_str(key_len);
+ }
+ return 0;
+}
+
+int
+do_op_nop_delete(struct update_op *op, struct update_field *field)
+{
+ assert(op->opcode == '#');
+ assert(field->type == UPDATE_NOP);
+ bool is_found = false;
+ int key_len_or_index = 0;
+ if (update_bar_locate_opt(op, field, &is_found, &key_len_or_index) != 0)
+ return -1;
+ if (! is_found)
+ return update_err_no_such_field(op);
+ if (mp_typeof(*field->bar.parent) == MP_ARRAY) {
+ const char *tmp = field->bar.parent;
+ uint32_t size = mp_decode_array(&tmp);
+ if (key_len_or_index + op->arg.del.count > size)
+ op->arg.del.count = size - key_len_or_index;
+ const char *end = field->bar.point + field->bar.point_size;
+ for (uint32_t i = 1; i < op->arg.del.count; ++i)
+ mp_next(&end);
+ field->bar.point_size = end - field->bar.point;
+ } else {
+ if (op->arg.del.count != 1)
+ return update_err_delete1(op);
+ /* Take key size into account to delete it too. */
+ uint32_t key_size = mp_sizeof_str(key_len_or_index);
+ field->bar.point -= key_size;
+ field->bar.point_size += key_size;
+ }
+ return 0;
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type) \
+int \
+do_op_bar_##op_type(struct update_op *op, struct update_field *field) \
+{ \
+ (void) op; \
+ (void) field; \
+ assert(field->type == UPDATE_BAR); \
+ diag_set(ClientError, ER_UNSUPPORTED, "update", \
+ "intersected JSON paths"); \
+ return -1; \
+}
+
+DO_SCALAR_OP_GENERIC(insert)
+
+DO_SCALAR_OP_GENERIC(set)
+
+DO_SCALAR_OP_GENERIC(delete)
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+#undef DO_SCALAR_OP_GENERIC
+
+#define DO_SCALAR_OP_GENERIC(op_type) \
+int \
+do_op_nop_##op_type(struct update_op *op, struct update_field *field) \
+{ \
+ assert(field->type == UPDATE_NOP); \
+ if (update_bar_locate(op, field) != 0) \
+ return -1; \
+ return update_op_do_##op_type(op, field->bar.point); \
+}
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+uint32_t
+update_bar_sizeof(struct update_field *field)
+{
+ assert(field->type == UPDATE_BAR);
+ switch(field->bar.op->opcode) {
+ case '!': {
+ const char *parent = field->bar.parent;
+ uint32_t size = field->size + field->bar.op->new_field_len;
+ if (mp_typeof(*parent) == MP_ARRAY) {
+ uint32_t array_size = mp_decode_array(&parent);
+ return size + mp_sizeof_array(array_size + 1) -
+ mp_sizeof_array(array_size);
+ } else {
+ uint32_t map_size = mp_decode_map(&parent);
+ return size + mp_sizeof_map(map_size + 1) -
+ mp_sizeof_map(map_size);
+ }
+ }
+ case '#': {
+ const char *parent = field->bar.parent;
+ uint32_t delete_count = field->bar.op->arg.del.count;
+ uint32_t size = field->size - field->bar.point_size;
+ if (mp_typeof(*parent) == MP_ARRAY) {
+ uint32_t array_size = mp_decode_array(&parent);
+ assert(array_size >= delete_count);
+ return size - mp_sizeof_array(array_size) +
+ mp_sizeof_array(array_size - delete_count);
+ } else {
+ uint32_t map_size = mp_decode_map(&parent);
+ assert(delete_count == 1);
+ return size - mp_sizeof_map(map_size) +
+ mp_sizeof_map(map_size - 1);
+ }
+ }
+ default: {
+ return field->size - field->bar.point_size +
+ field->bar.op->new_field_len;
+ }
+ }
+}
+
+uint32_t
+update_bar_store(struct update_field *field, char *out, char *out_end)
+{
+ assert(field->type == UPDATE_BAR);
+ (void) out_end;
+ struct update_op *op = field->bar.op;
+ char *out_saved = out;
+ switch(op->opcode) {
+ case '!': {
+ const char *pos = field->bar.parent;
+ uint32_t before_parent = pos - field->data;
+ /* Before parent. */
+ memcpy(out, field->data, before_parent);
+ out += before_parent;
+ if (mp_typeof(*pos) == MP_ARRAY) {
+ /* New array header. */
+ uint32_t size = mp_decode_array(&pos);
+ out = mp_encode_array(out, size + 1);
+ /* Before insertion point. */
+ size = field->bar.point - pos;
+ memcpy(out, pos, size);
+ out += size;
+ pos += size;
+ } else {
+ /* New map header. */
+ uint32_t size = mp_decode_map(&pos);
+ out = mp_encode_map(out, size + 1);
+ /* New key. */
+ out = mp_encode_str(out, field->bar.new_key,
+ field->bar.new_key_len);
+ }
+ /* New value. */
+ memcpy(out, op->arg.set.value, op->arg.set.length);
+ out += op->arg.set.length;
+ /* Old values and field tail. */
+ uint32_t after_point = field->data + field->size - pos;
+ memcpy(out, pos, after_point);
+ out += after_point;
+ return out - out_saved;
+ }
+ case '#': {
+ const char *pos = field->bar.parent;
+ uint32_t size, before_parent = pos - field->data;
+ memcpy(out, field->data, before_parent);
+ out += before_parent;
+ if (mp_typeof(*pos) == MP_ARRAY) {
+ size = mp_decode_array(&pos);
+ out = mp_encode_array(out, size - op->arg.del.count);
+ } else {
+ size = mp_decode_map(&pos);
+ out = mp_encode_map(out, size - 1);
+ }
+ size = field->bar.point - pos;
+ memcpy(out, pos, size);
+ out += size;
+ pos = field->bar.point + field->bar.point_size;
+
+ size = field->data + field->size - pos;
+ memcpy(out, pos, size);
+ return out + size - out_saved;
+ }
+ default: {
+ uint32_t before_point = field->bar.point - field->data;
+ const char *field_end = field->data + field->size;
+ const char *point_end =
+ field->bar.point + field->bar.point_size;
+ uint32_t after_point = field_end - point_end;
+
+ memcpy(out, field->data, before_point);
+ out += before_point;
+ op->meta->store_cb(op, field->bar.point, out);
+ out += op->new_field_len;
+ memcpy(out, point_end, after_point);
+ return out + after_point - out_saved;
+ }
+ }
+}
diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c
index b4ede54db..6baad02dd 100644
--- a/src/box/update/update_field.c
+++ b/src/box/update/update_field.c
@@ -42,7 +42,9 @@
static inline const char *
update_op_field_str(const struct update_op *op)
{
- if (op->field_no >= 0)
+ if (op->lexer.src != NULL)
+ return tt_sprintf("'%.*s'", op->lexer.src_len, op->lexer.src);
+ else if (op->field_no >= 0)
return tt_sprintf("%d", op->field_no + TUPLE_INDEX_BASE);
else
return tt_sprintf("%d", op->field_no);
@@ -83,8 +85,12 @@ update_err_splice_bound(const struct update_op *op)
int
update_err_no_such_field(const struct update_op *op)
{
- diag_set(ClientError, ER_NO_SUCH_FIELD_NO, op->field_no >= 0 ?
- TUPLE_INDEX_BASE + op->field_no : op->field_no);
+ if (op->lexer.src == NULL) {
+ diag_set(ClientError, ER_NO_SUCH_FIELD_NO, op->field_no +
+ (op->field_no >= 0 ? TUPLE_INDEX_BASE : 0));
+ return -1;
+ }
+ diag_set(ClientError, ER_NO_SUCH_FIELD_NAME, update_op_field_str(op));
return -1;
}
@@ -108,6 +114,8 @@ update_field_sizeof(struct update_field *field)
return field->scalar.op->new_field_len;
case UPDATE_ARRAY:
return update_array_sizeof(field);
+ case UPDATE_BAR:
+ return update_bar_sizeof(field);
default:
unreachable();
}
@@ -132,6 +140,8 @@ update_field_store(struct update_field *field, char *out, char *out_end)
return size;
case UPDATE_ARRAY:
return update_array_store(field, out, out_end);
+ case UPDATE_BAR:
+ return update_bar_store(field, out, out_end);
default:
unreachable();
}
@@ -616,6 +626,7 @@ update_op_decode(struct update_op *op, int index_base,
switch(mp_typeof(**expr)) {
case MP_INT:
case MP_UINT: {
+ json_lexer_create(&op->lexer, NULL, 0, 0);
if (mp_read_i32(op, expr, &field_no) != 0)
return -1;
if (field_no - index_base >= 0) {
@@ -631,14 +642,35 @@ update_op_decode(struct update_op *op, int index_base,
case MP_STR: {
const char *path = mp_decode_str(expr, &len);
uint32_t field_no, hash = field_name_hash(path, len);
+ json_lexer_create(&op->lexer, path, len, TUPLE_INDEX_BASE);
if (tuple_fieldno_by_name(dict, path, len, hash,
&field_no) == 0) {
op->field_no = (int32_t) field_no;
+ op->lexer.offset = len;
break;
}
- diag_set(ClientError, ER_NO_SUCH_FIELD_NAME,
- tt_cstr(path, len));
- return -1;
+ struct json_token token;
+ int rc = json_lexer_next_token(&op->lexer, &token);
+ if (rc != 0)
+ return update_err_bad_json(op, rc);
+ switch (token.type) {
+ case JSON_TOKEN_NUM:
+ op->field_no = token.num;
+ break;
+ case JSON_TOKEN_STR:
+ hash = field_name_hash(token.str, token.len);
+ if (tuple_fieldno_by_name(dict, token.str, token.len,
+ hash, &field_no) == 0) {
+ op->field_no = (int32_t) field_no;
+ break;
+ }
+ FALLTHROUGH;
+ default:
+ diag_set(ClientError, ER_NO_SUCH_FIELD_NAME,
+ tt_cstr(path, len));
+ return -1;
+ }
+ break;
}
default:
diag_set(ClientError, ER_ILLEGAL_PARAMS,
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
index 8ce3c3e82..d4499eff8 100644
--- a/src/box/update/update_field.h
+++ b/src/box/update/update_field.h
@@ -33,6 +33,7 @@
#include "trivia/util.h"
#include "tt_static.h"
#include <stdint.h>
+#include "json/json.h"
#include "bit/int96.h"
#include "mp_decimal.h"
@@ -180,6 +181,12 @@ struct update_op {
uint32_t new_field_len;
/** Opcode symbol: = + - / ... */
char opcode;
+ /**
+ * Operation target path and its lexer in one. This lexer
+ * is used when the operation is applied down through the
+ * update tree.
+ */
+ struct json_lexer lexer;
};
/**
@@ -196,6 +203,16 @@ int
update_op_decode(struct update_op *op, int index_base,
struct tuple_dictionary *dict, const char **expr);
+/**
+ * Check if the operation should be applied on the current path
+ * node.
+ */
+static inline bool
+update_op_is_term(const struct update_op *op)
+{
+ return json_lexer_is_eof(&op->lexer);
+}
+
/* }}} update_op */
/* {{{ update_field */
@@ -220,6 +237,14 @@ enum update_type {
* of individual fields.
*/
UPDATE_ARRAY,
+ /**
+	 * Field of this type stores an update whose JSON path is
+	 * non-empty and does not intersect with any other update.
+	 * In this optimized case it is not necessary to allocate
+	 * fields, ops, or anything else for path nodes. And this
+	 * is the most common case.
+ */
+ UPDATE_BAR,
};
/**
@@ -254,6 +279,49 @@ struct update_field {
struct {
struct rope *rope;
} array;
+ /**
+	 * Bar update - an update by a JSON path that does not
+	 * intersect with any other update.
+ */
+ struct {
+ /** Bar update is a single operation. */
+ struct update_op *op;
+ /**
+ * Always has a non-empty head path
+ * leading inside this field's data.
+ */
+ const char *path;
+ int path_len;
+ /**
+ * For insertion/deletion to change parent
+ * header.
+ */
+ const char *parent;
+ union {
+ /**
+ * For scalar op; insertion into
+ * array; deletion. This is the
+ * point to delete, change or
+ * insert after.
+ */
+ struct {
+ const char *point;
+ uint32_t point_size;
+ };
+ /*
+ * For insertion into map. New
+ * key. On insertion into a map
+ * there is no strict order as in
+ * array and no point. The field
+ * is inserted just right after
+ * the parent header.
+ */
+ struct {
+ const char *new_key;
+ uint32_t new_key_len;
+ };
+ };
+ } bar;
};
};
@@ -323,6 +391,18 @@ OP_DECL_GENERIC(array)
/* }}} update_field.array */
+/* {{{ update_field.bar */
+
+OP_DECL_GENERIC(bar)
+
+/* }}} update_field.bar */
+
+/* {{{ update_field.nop */
+
+OP_DECL_GENERIC(nop)
+
+/* }}} update_field.nop */
+
#undef OP_DECL_GENERIC
/* {{{ Common helpers. */
@@ -342,6 +422,10 @@ do_op_##op_type(struct update_op *op, struct update_field *field) \
switch (field->type) { \
case UPDATE_ARRAY: \
return do_op_array_##op_type(op, field); \
+ case UPDATE_NOP: \
+ return do_op_nop_##op_type(op, field); \
+ case UPDATE_BAR: \
+ return do_op_bar_##op_type(op, field); \
default: \
unreachable(); \
} \
@@ -407,6 +491,24 @@ update_err_double(const struct update_op *op)
return update_err(op, "double update of the same field");
}
+static inline int
+update_err_bad_json(const struct update_op *op, int pos)
+{
+ return update_err(op, tt_sprintf("invalid JSON in position %d", pos));
+}
+
+static inline int
+update_err_delete1(const struct update_op *op)
+{
+ return update_err(op, "can delete only 1 field from a map in a row");
+}
+
+static inline int
+update_err_duplicate(const struct update_op *op)
+{
+ return update_err(op, "the key exists already");
+}
+
/** }}} Error helpers. */
#endif /* TARANTOOL_BOX_TUPLE_UPDATE_FIELD_H */
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 7455c2c86..7fade332b 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -1997,15 +1997,28 @@ request_normalize_ops(struct request *request)
ops_end = mp_encode_str(ops_end, op_name, op_name_len);
int field_no;
- if (mp_typeof(*pos) == MP_INT) {
+ const char *field_name;
+ switch (mp_typeof(*pos)) {
+ case MP_INT:
field_no = mp_decode_int(&pos);
ops_end = mp_encode_int(ops_end, field_no);
- } else {
+ break;
+ case MP_UINT:
field_no = mp_decode_uint(&pos);
field_no -= request->index_base;
ops_end = mp_encode_uint(ops_end, field_no);
+ break;
+ case MP_STR:
+ field_name = pos;
+ mp_next(&pos);
+ memcpy(ops_end, field_name, pos - field_name);
+ ops_end += pos - field_name;
+ break;
+ default:
+ unreachable();
}
+
if (*op_name == ':') {
/**
* splice op adjust string pos and copy
diff --git a/test/box/update.result b/test/box/update.result
index 6c7bf09df..38b8bc13e 100644
--- a/test/box/update.result
+++ b/test/box/update.result
@@ -834,7 +834,7 @@ s:update({0}, {{'+', 0}})
...
s:update({0}, {{'+', '+', '+'}})
---
-- error: Field '+' was not found in the tuple
+- error: 'Field ''+'' UPDATE error: invalid JSON in position 1'
...
s:update({0}, {{0, 0, 0}})
---
@@ -889,3 +889,411 @@ s:update(1, {{'=', 3, map}})
s:drop()
---
...
+--
+-- gh-1261: update by JSON path.
+--
+format = {}
+---
+...
+format[1] = {'field1', 'unsigned'}
+---
+...
+format[2] = {'f', 'map'}
+---
+...
+format[3] = {'g', 'array'}
+---
+...
+s = box.schema.create_space('test', {format = format})
+---
+...
+pk = s:create_index('pk')
+---
+...
+t = {}
+---
+...
+t[1] = 1
+---
+...
+t[2] = { \
+ a = 100, \
+ b = 200, \
+ c = { \
+ d = 400, \
+ e = 500, \
+ f = {4, 5, 6, 7, 8}, \
+ g = {k = 600, l = 700} \
+ }, \
+ m = true, \
+ g = {800, 900} \
+}; \
+t[3] = { \
+ 100, \
+ 200, \
+ { \
+ {300, 350}, \
+ {400, 450} \
+ }, \
+ {a = 500, b = 600}, \
+ {c = 700, d = 800} \
+}
+---
+...
+t = s:insert(t)
+---
+...
+t4_array = t:update({{'!', 4, setmetatable({}, {__serialize = 'array'})}})
+---
+...
+t4_map = t:update({{'!', 4, setmetatable({}, {__serialize = 'map'})}})
+---
+...
+t
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+--
+-- First, test simple non-intersecting paths.
+--
+--
+-- !
+--
+t:update({{'!', 'f.c.f[1]', 3}, {'!', '[3][1]', {100, 200, 300}}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [3, 4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [[100, 200, 300], 100, 200, [
+ [300, 350], [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'!', 'f.g[3]', 1000}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900, 1000]}, [100, 200, [[300, 350],
+ [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'!', 'g[6]', 'new element'}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}, 'new element']]
+...
+t:update({{'!', 'f.e', 300}, {'!', 'g[4].c', 700}})
+---
+- [1, {'b': 200, 'm': true, 'g': [800, 900], 'a': 100, 'c': {'d': 400, 'f': [4, 5,
+ 6, 7, 8], 'e': 500, 'g': {'k': 600, 'l': 700}}, 'e': 300}, [100, 200, [[300,
+ 350], [400, 450]], {'b': 600, 'c': 700, 'a': 500}, {'c': 700, 'd': 800}]]
+...
+t:update({{'!', 'f.c.f[2]', 4.5}, {'!', 'g[3][2][2]', 425}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 4.5, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 425, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t2 = t:update({{'!', 'g[6]', {100}}})
+---
+...
+-- Test single element array update.
+t2:update({{'!', 'g[6][2]', 200}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}, [100, 200]]]
+...
+t2:update({{'!', 'g[6][1]', 50}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}, [50, 100]]]
+...
+-- Test empty array/map.
+t4_array:update({{'!', '[4][1]', 100}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], [100]]
+...
+t4_map:update({{'!', '[4].a', 100}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], {'a': 100}]
+...
+-- Test errors.
+t:update({{'!', 'a', 100}}) -- No such field.
+---
+- error: Field 'a' was not found in the tuple
+...
+t:update({{'!', 'f.a', 300}}) -- Key already exists.
+---
+- error: 'Field ''f.a'' UPDATE error: the key exists already'
+...
+t:update({{'!', 'f.c.f[0]', 3.5}}) -- No such index, too small.
+---
+- error: 'Field ''f.c.f[0]'' UPDATE error: invalid JSON in position 7'
+...
+t:update({{'!', 'f.c.f[100]', 100}}) -- No such index, too big.
+---
+- error: Field ''f.c.f[100]'' was not found in the tuple
+...
+t:update({{'!', 'g[4][100]', 700}}) -- Insert index into map.
+---
+- error: 'Field ''g[4][100]'' UPDATE error: can not access by index a non-array field'
+...
+t:update({{'!', 'g[1][1]', 300}})
+---
+- error: 'Field ''g[1][1]'' UPDATE error: can not access by index a non-array field'
+...
+t:update({{'!', 'f.g.a', 700}}) -- Insert key into array.
+---
+- error: 'Field ''f.g.a'' UPDATE error: can not access by key a non-map field'
+...
+t:update({{'!', 'f.g[1].a', 700}})
+---
+- error: 'Field ''f.g[1].a'' UPDATE error: can not access by key a non-map field'
+...
+t:update({{'!', 'f[*].k', 20}}) -- 'Any' is not considered valid JSON.
+---
+- error: 'Field ''f[*].k'' UPDATE error: invalid JSON in position 3'
+...
+-- JSON error after the non-existent field to insert.
+t:update({{'!', '[2].e.100000', 100}})
+---
+- error: 'Field ''[2].e.100000'' UPDATE error: invalid JSON in position 7'
+...
+-- Correct JSON, but next to last field does not exist. '!' can't
+-- create the whole path.
+t:update({{'!', '[2].e.f', 100}})
+---
+- error: Field ''[2].e.f'' was not found in the tuple
+...
+--
+-- =
+--
+-- Set existing fields.
+t:update({{'=', 'f.a', 150}, {'=', 'g[3][1][2]', 400}})
+---
+- [1, {'b': 200, 'm': true, 'a': 150, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 400], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'=', 'f', {a = 100, b = 200}}})
+---
+- [1, {'a': 100, 'b': 200}, [100, 200, [[300, 350], [400, 450]], {'a': 500, 'b': 600},
+ {'c': 700, 'd': 800}]]
+...
+t:update({{'=', 'g[4].b', 700}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 700}, {'c': 700, 'd': 800}]]
+...
+-- Insert via set.
+t:update({{'=', 'f.e', 300}})
+---
+- [1, {'b': 200, 'm': true, 'g': [800, 900], 'a': 100, 'c': {'d': 400, 'f': [4, 5,
+ 6, 7, 8], 'e': 500, 'g': {'k': 600, 'l': 700}}, 'e': 300}, [100, 200, [[300,
+ 350], [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'=', 'f.g[3]', 1000}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900, 1000]}, [100, 200, [[300, 350],
+ [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'=', 'f.g[1]', 0}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [0, 900]}, [100, 200, [[300, 350], [400, 450]],
+ {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+-- Test empty array/map.
+t4_array:update({{'=', '[4][1]', 100}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], [100]]
+...
+t4_map:update({{'=', '[4]["a"]', 100}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], {'a': 100}]
+...
+-- Test errors.
+t:update({{'=', 'f.a[1]', 100}})
+---
+- error: 'Field ''f.a[1]'' UPDATE error: can not access by index a non-array field'
+...
+t:update({{'=', 'f.a.k', 100}})
+---
+- error: 'Field ''f.a.k'' UPDATE error: can not access by key a non-map field'
+...
+t:update({{'=', 'f.c.f[1]', 100}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [100, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'=', 'f.c.f[100]', 100}})
+---
+- error: Field ''f.c.f[100]'' was not found in the tuple
+...
+t:update({{'=', '[2].c.f 1 1 1 1', 100}})
+---
+- error: 'Field ''[2].c.f 1 1 1 1'' UPDATE error: invalid JSON in position 8'
+...
+--
+-- #
+--
+t:update({{'#', '[2].b', 1}})
+---
+- [1, {'a': 100, 'm': true, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500, 'g': {
+ 'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, 450]],
+ {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'#', 'f.c.f[1]', 1}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'#', 'f.c.f[1]', 2}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [6, 7, 8], 'e': 500, 'g': {
+ 'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, 450]],
+ {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'#', 'f.c.f[1]', 100}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [], 'e': 500, 'g': {'k': 600,
+ 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400, 450]], {'a': 500,
+ 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'#', 'f.c.f[5]', 1}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'#', 'f.c.f[5]', 2}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+-- Test errors.
+t:update({{'#', 'f.h', 1}})
+---
+- error: Field ''f.h'' was not found in the tuple
+...
+t:update({{'#', 'f.c.f[100]', 1}})
+---
+- error: Field ''f.c.f[100]'' was not found in the tuple
+...
+t:update({{'#', 'f.b', 2}})
+---
+- error: 'Field ''f.b'' UPDATE error: can delete only 1 field from a map in a row'
+...
+t:update({{'#', 'f.b', 0}})
+---
+- error: 'Field ''f.b'' UPDATE error: cannot delete 0 fields'
+...
+t:update({{'#', 'f', 0}})
+---
+- error: 'Field ''f'' UPDATE error: cannot delete 0 fields'
+...
+--
+-- Scalar operations.
+--
+t:update({{'+', 'f.a', 50}})
+---
+- [1, {'b': 200, 'm': true, 'a': 150, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'-', 'f.c.f[1]', 0.5}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [3.5, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t:update({{'&', 'f.c.f[2]', 4}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 4, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t2 = t:update({{'=', 4, {str = 'abcd'}}})
+---
+...
+t2:update({{':', '[4].str', 2, 2, 'e'}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}], {'str': 'aed'}]
+...
+-- Test errors.
+t:update({{'+', 'g[3]', 50}})
+---
+- error: 'Argument type in operation ''+'' on field ''g[3]'' does not match field
+ type: expected a number'
+...
+t:update({{'+', '[2].b.......', 100}})
+---
+- error: 'Field ''[2].b.......'' UPDATE error: invalid JSON in position 7'
+...
+t:update({{'+', '[2].b.c.d.e', 100}})
+---
+- error: Field ''[2].b.c.d.e'' was not found in the tuple
+...
+t:update({{'-', '[2][*]', 20}})
+---
+- error: 'Field ''[2][*]'' UPDATE error: invalid JSON in position 5'
+...
+-- Vinyl normalizes field numbers. It should not touch paths,
+-- and they should not affect squashing.
+format = {}
+---
+...
+format[1] = {'field1', 'unsigned'}
+---
+...
+format[2] = {'field2', 'any'}
+---
+...
+vy_s = box.schema.create_space('test2', {engine = 'vinyl', format = format})
+---
+...
+pk = vy_s:create_index('pk')
+---
+...
+_ = vy_s:replace(t)
+---
+...
+box.begin()
+---
+...
+-- Use scalar operations, only they can be squashed.
+vy_s:upsert({1, 1}, {{'+', 'field2.c.f[1]', 1}})
+---
+...
+vy_s:upsert({1, 1}, {{'+', '[3][3][1][1]', 1}})
+---
+...
+box.commit()
+---
+...
+vy_s:select()
+---
+- - [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [5, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[301, 350], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+vy_s:drop()
+---
+...
+s:drop()
+---
+...
diff --git a/test/box/update.test.lua b/test/box/update.test.lua
index ac7698ce9..60e669d27 100644
--- a/test/box/update.test.lua
+++ b/test/box/update.test.lua
@@ -280,3 +280,148 @@ t:update({{'=', 3, map}})
s:update(1, {{'=', 3, map}})
s:drop()
+
+--
+-- gh-1261: update by JSON path.
+--
+format = {}
+format[1] = {'field1', 'unsigned'}
+format[2] = {'f', 'map'}
+format[3] = {'g', 'array'}
+s = box.schema.create_space('test', {format = format})
+pk = s:create_index('pk')
+t = {}
+t[1] = 1
+t[2] = { \
+ a = 100, \
+ b = 200, \
+ c = { \
+ d = 400, \
+ e = 500, \
+ f = {4, 5, 6, 7, 8}, \
+ g = {k = 600, l = 700} \
+ }, \
+ m = true, \
+ g = {800, 900} \
+}; \
+t[3] = { \
+ 100, \
+ 200, \
+ { \
+ {300, 350}, \
+ {400, 450} \
+ }, \
+ {a = 500, b = 600}, \
+ {c = 700, d = 800} \
+}
+t = s:insert(t)
+
+t4_array = t:update({{'!', 4, setmetatable({}, {__serialize = 'array'})}})
+t4_map = t:update({{'!', 4, setmetatable({}, {__serialize = 'map'})}})
+
+t
+--
+-- First, test simple non-intersecting paths.
+--
+
+--
+-- !
+--
+t:update({{'!', 'f.c.f[1]', 3}, {'!', '[3][1]', {100, 200, 300}}})
+t:update({{'!', 'f.g[3]', 1000}})
+t:update({{'!', 'g[6]', 'new element'}})
+t:update({{'!', 'f.e', 300}, {'!', 'g[4].c', 700}})
+t:update({{'!', 'f.c.f[2]', 4.5}, {'!', 'g[3][2][2]', 425}})
+t2 = t:update({{'!', 'g[6]', {100}}})
+-- Test single element array update.
+t2:update({{'!', 'g[6][2]', 200}})
+t2:update({{'!', 'g[6][1]', 50}})
+-- Test empty array/map.
+t4_array:update({{'!', '[4][1]', 100}})
+t4_map:update({{'!', '[4].a', 100}})
+-- Test errors.
+t:update({{'!', 'a', 100}}) -- No such field.
+t:update({{'!', 'f.a', 300}}) -- Key already exists.
+t:update({{'!', 'f.c.f[0]', 3.5}}) -- No such index, too small.
+t:update({{'!', 'f.c.f[100]', 100}}) -- No such index, too big.
+t:update({{'!', 'g[4][100]', 700}}) -- Insert index into map.
+t:update({{'!', 'g[1][1]', 300}})
+t:update({{'!', 'f.g.a', 700}}) -- Insert key into array.
+t:update({{'!', 'f.g[1].a', 700}})
+t:update({{'!', 'f[*].k', 20}}) -- 'Any' is not considered valid JSON.
+-- JSON error after the non-existent field to insert.
+t:update({{'!', '[2].e.100000', 100}})
+-- Correct JSON, but next to last field does not exist. '!' can't
+-- create the whole path.
+t:update({{'!', '[2].e.f', 100}})
+
+--
+-- =
+--
+-- Set existing fields.
+t:update({{'=', 'f.a', 150}, {'=', 'g[3][1][2]', 400}})
+t:update({{'=', 'f', {a = 100, b = 200}}})
+t:update({{'=', 'g[4].b', 700}})
+-- Insert via set.
+t:update({{'=', 'f.e', 300}})
+t:update({{'=', 'f.g[3]', 1000}})
+t:update({{'=', 'f.g[1]', 0}})
+-- Test empty array/map.
+t4_array:update({{'=', '[4][1]', 100}})
+t4_map:update({{'=', '[4]["a"]', 100}})
+-- Test errors.
+t:update({{'=', 'f.a[1]', 100}})
+t:update({{'=', 'f.a.k', 100}})
+t:update({{'=', 'f.c.f[1]', 100}})
+t:update({{'=', 'f.c.f[100]', 100}})
+t:update({{'=', '[2].c.f 1 1 1 1', 100}})
+
+--
+-- #
+--
+t:update({{'#', '[2].b', 1}})
+t:update({{'#', 'f.c.f[1]', 1}})
+t:update({{'#', 'f.c.f[1]', 2}})
+t:update({{'#', 'f.c.f[1]', 100}})
+t:update({{'#', 'f.c.f[5]', 1}})
+t:update({{'#', 'f.c.f[5]', 2}})
+-- Test errors.
+t:update({{'#', 'f.h', 1}})
+t:update({{'#', 'f.c.f[100]', 1}})
+t:update({{'#', 'f.b', 2}})
+t:update({{'#', 'f.b', 0}})
+t:update({{'#', 'f', 0}})
+
+--
+-- Scalar operations.
+--
+t:update({{'+', 'f.a', 50}})
+t:update({{'-', 'f.c.f[1]', 0.5}})
+t:update({{'&', 'f.c.f[2]', 4}})
+t2 = t:update({{'=', 4, {str = 'abcd'}}})
+t2:update({{':', '[4].str', 2, 2, 'e'}})
+-- Test errors.
+t:update({{'+', 'g[3]', 50}})
+t:update({{'+', '[2].b.......', 100}})
+t:update({{'+', '[2].b.c.d.e', 100}})
+t:update({{'-', '[2][*]', 20}})
+
+-- Vinyl normalizes field numbers. It should not touch paths,
+-- and they should not affect squashing.
+format = {}
+format[1] = {'field1', 'unsigned'}
+format[2] = {'field2', 'any'}
+vy_s = box.schema.create_space('test2', {engine = 'vinyl', format = format})
+pk = vy_s:create_index('pk')
+_ = vy_s:replace(t)
+
+box.begin()
+-- Use scalar operations, only they can be squashed.
+vy_s:upsert({1, 1}, {{'+', 'field2.c.f[1]', 1}})
+vy_s:upsert({1, 1}, {{'+', '[3][3][1][1]', 1}})
+box.commit()
+
+vy_s:select()
+vy_s:drop()
+
+s:drop()
diff --git a/test/engine/update.result b/test/engine/update.result
index f181924f3..ddb13bd5b 100644
--- a/test/engine/update.result
+++ b/test/engine/update.result
@@ -843,11 +843,6 @@ t:update({{'+', '[1]', 50}})
---
- [1, [10, 11, 12], {'b': 21, 'a': 20, 'c': 22}, 'abcdefgh', true, -100, 250]
...
--- JSON paths are not allowed yet.
-t:update({{'=', 'field2[1]', 13}})
----
-- error: Field 'field2[1]' was not found in the tuple
-...
s:update({1}, {{'=', 'field3', {d = 30, e = 31, f = 32}}})
---
- [1, [10, 11, 12], {'d': 30, 'f': 32, 'e': 31}, 'abcdefgh', true, -100, 200]
diff --git a/test/engine/update.test.lua b/test/engine/update.test.lua
index 4ca2589e4..31fca2b7b 100644
--- a/test/engine/update.test.lua
+++ b/test/engine/update.test.lua
@@ -156,8 +156,6 @@ t:update({{':', 'field4', 3, 3, 'bbccdd'}, {'+', 'field6', 50}, {'!', 7, 300}})
-- Any path is interpreted as a field name first. And only then
-- as JSON.
t:update({{'+', '[1]', 50}})
--- JSON paths are not allowed yet.
-t:update({{'=', 'field2[1]', 13}})
s:update({1}, {{'=', 'field3', {d = 30, e = 31, f = 32}}})
diff --git a/test/unit/column_mask.c b/test/unit/column_mask.c
index 5ee8b7332..38ec34f8f 100644
--- a/test/unit/column_mask.c
+++ b/test/unit/column_mask.c
@@ -225,16 +225,87 @@ basic_test()
column_masks[i]);
}
+static void
+test_paths(void)
+{
+ header();
+ plan(2);
+
+ char buffer1[1024];
+ char *pos1 = mp_encode_array(buffer1, 7);
+
+ pos1 = mp_encode_uint(pos1, 1);
+ pos1 = mp_encode_uint(pos1, 2);
+ pos1 = mp_encode_array(pos1, 2);
+ pos1 = mp_encode_uint(pos1, 3);
+ pos1 = mp_encode_uint(pos1, 4);
+ pos1 = mp_encode_uint(pos1, 5);
+ pos1 = mp_encode_array(pos1, 2);
+ pos1 = mp_encode_uint(pos1, 6);
+ pos1 = mp_encode_uint(pos1, 7);
+ pos1 = mp_encode_uint(pos1, 8);
+ pos1 = mp_encode_uint(pos1, 9);
+
+
+ char buffer2[1024];
+ char *pos2 = mp_encode_array(buffer2, 2);
+
+ pos2 = mp_encode_array(pos2, 3);
+ pos2 = mp_encode_str(pos2, "!", 1);
+ pos2 = mp_encode_str(pos2, "[3][1]", 6);
+ pos2 = mp_encode_double(pos2, 2.5);
+
+ pos2 = mp_encode_array(pos2, 3);
+ pos2 = mp_encode_str(pos2, "#", 1);
+ pos2 = mp_encode_str(pos2, "[5][1]", 6);
+ pos2 = mp_encode_uint(pos2, 1);
+
+ struct region *gc = &fiber()->gc;
+ size_t svp = region_used(gc);
+ uint32_t result_size;
+ uint64_t column_mask;
+ const char *result =
+ tuple_update_execute(buffer2, pos2, buffer1, pos1,
+ box_tuple_format_default()->dict,
+ &result_size, 1, &column_mask);
+ isnt(result, NULL, "JSON update works");
+
+ /*
+	 * On their first level the updates change fields [3] and
+	 * [5], or 2 and 4 if 0-based. If that was the only level,
+	 * the operations '!' and '#' would change all the fields
+	 * starting from 2. But neither of these operations is for
+	 * the root, and therefore they do not affect anything
+	 * except [3] and [5] on the first level.
+ */
+ uint64_t expected_mask = 0;
+ column_mask_set_fieldno(&expected_mask, 2);
+ column_mask_set_fieldno(&expected_mask, 4);
+ is(column_mask, expected_mask, "column mask match");
+
+ region_truncate(gc, svp);
+
+ check_plan();
+ footer();
+}
+
+static uint32_t
+simple_hash(const char* str, uint32_t len)
+{
+ return str[0] + len;
+}
+
int
main()
{
memory_init();
fiber_init(fiber_c_invoke);
- tuple_init(NULL);
+ tuple_init(simple_hash);
header();
- plan(27);
+ plan(28);
basic_test();
+ test_paths();
footer();
check_plan();
diff --git a/test/unit/column_mask.result b/test/unit/column_mask.result
index 9309e6cdc..1d87a2f24 100644
--- a/test/unit/column_mask.result
+++ b/test/unit/column_mask.result
@@ -1,5 +1,5 @@
*** main ***
-1..27
+1..28
ok 1 - check result length
ok 2 - tuple update is correct
ok 3 - column_mask is correct
@@ -27,4 +27,10 @@ ok 24 - column_mask is correct
ok 25 - check result length
ok 26 - tuple update is correct
ok 27 - column_mask is correct
+ *** test_paths ***
+ 1..2
+ ok 1 - JSON update works
+ ok 2 - column mask match
+ok 28 - subtests
+ *** test_paths: done ***
*** main: done ***
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [tarantool-patches] [PATCH v2 6/8] tuple: make update operation tokens consumable
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
` (4 preceding siblings ...)
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 5/8] tuple: enable JSON bar updates Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 7/8] tuple: JSON updates support intersection by arrays Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 8/8] tuple: JSON updates support intersection by maps Vladislav Shpilevoy
7 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
There is a case: [1][2][3][4] = 100. It is not a problem when it
is a single operation, not intersecting with anything. It is a
bar then, and works fine. But the next patch allows intersections,
and the path [1][2][3][4] can become a tree of arrays.
The root array has its child array [1]. The child has its own
child array [2], and so on. Each of them would call
do_op_set(child). But all array update routines expect that
field_no is already extracted from the path, and will look at
update_op.field_no to find a target field.
Does it mean that each next level of the update [1][2][3][4]
should prepare field_no for the next child? In such a case each
level would need to check whether the child is an array, which
would complicate the code even more: each array update operation
would check child types; each do_op_##optype() would check field
type. As if that was not enough, there are also maps - they
double the complexity.
So this patch goes another way - it lets each next level of the
update check whether its field_no/map_key is already prepared by
the caller, and if not, extract a next token from the operation
path.
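For illustration, here is a minimal Lua sketch of the case above
(the space and data are hypothetical, not from the test suite;
with the whole series applied, both operations share the array
prefix, and each array level consumes its own index token from
the path):

  -- Hypothetical space whose second field is a nested array.
  s = box.schema.create_space('example')
  s:create_index('pk')
  t = s:replace({1, {{{1, 2, 3, 4}}}})
  -- Both paths go through the same nested arrays; every array
  -- level extracts the next numeric token by itself instead of
  -- expecting the parent to prepare field_no.
  t:update({{'=', '[2][1][1][4]', 100}, {'=', '[2][1][1][2]', 20}})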
Part of #1261
---
src/box/update/update_array.c | 42 +++++++++++++++++++++++++++++++++--
src/box/update/update_field.c | 17 ++++++++++++++
src/box/update/update_field.h | 25 +++++++++++++++++++--
3 files changed, 80 insertions(+), 4 deletions(-)
diff --git a/src/box/update/update_array.c b/src/box/update/update_array.c
index fe50a605a..262eceafb 100644
--- a/src/box/update/update_array.c
+++ b/src/box/update/update_array.c
@@ -32,6 +32,26 @@
#include "msgpuck.h"
#include "fiber.h"
+/**
+ * Make sure @a op contains a valid field number. It may not be
+ * the case if the array's parent didn't propagate the
+ * operation's lexer. In fact, the parent fills fieldno only in
+ * some rare cases like branching. Generally, an array should
+ * take care of fieldno by itself.
+ */
+static inline int
+update_op_prepare_num_token(struct update_op *op)
+{
+ if (op->token_type == JSON_TOKEN_END &&
+ update_op_consume_token(op) != 0)
+ return -1;
+ if (op->token_type != JSON_TOKEN_NUM) {
+ return update_err(op, "can't update an array by not a "\
+ "number index");
+ }
+ return 0;
+}
+
/**
* Make field number positive and check for the field existence.
* @param op Update operation.
@@ -43,6 +63,7 @@
static inline int
update_op_adjust_field_no(struct update_op *op, int32_t field_count)
{
+ assert(op->token_type == JSON_TOKEN_NUM);
if (op->field_no >= 0) {
if (op->field_no < field_count)
return 0;
@@ -217,10 +238,14 @@ do_op_array_insert(struct update_op *op, struct update_field *field)
assert(field->type == UPDATE_ARRAY);
struct rope *rope = field->array.rope;
struct update_array_item *item;
+ if (update_op_prepare_num_token(op) != 0)
+ return -1;
+
if (! update_op_is_term(op)) {
item = update_array_extract_item(field, op);
if (item == NULL)
return -1;
+ op->token_type = JSON_TOKEN_END;
return do_op_insert(op, &item->field);
}
@@ -241,6 +266,9 @@ do_op_array_set(struct update_op *op, struct update_field *field)
{
assert(field->type == UPDATE_ARRAY);
struct rope *rope = field->array.rope;
+ if (update_op_prepare_num_token(op) != 0)
+ return -1;
+
/* Interpret '=' for n + 1 field as insert. */
if (op->field_no == (int32_t) rope_size(rope))
return do_op_array_insert(op, field);
@@ -249,8 +277,10 @@ do_op_array_set(struct update_op *op, struct update_field *field)
update_array_extract_item(field, op);
if (item == NULL)
return -1;
- if (! update_op_is_term(op))
+ if (! update_op_is_term(op)) {
+ op->token_type = JSON_TOKEN_END;
return do_op_set(op, &item->field);
+ }
op->new_field_len = op->arg.set.length;
/* Ignore the previous op, if any. */
item->field.type = UPDATE_SCALAR;
@@ -262,11 +292,15 @@ int
do_op_array_delete(struct update_op *op, struct update_field *field)
{
assert(field->type == UPDATE_ARRAY);
+ if (update_op_prepare_num_token(op) != 0)
+ return -1;
+
if (! update_op_is_term(op)) {
struct update_array_item *item =
update_array_extract_item(field, op);
if (item == NULL)
return -1;
+ op->token_type = JSON_TOKEN_END;
return do_op_delete(op, &item->field);
}
struct rope *rope = field->array.rope;
@@ -285,12 +319,16 @@ do_op_array_delete(struct update_op *op, struct update_field *field)
int \
do_op_array_##op_type(struct update_op *op, struct update_field *field) \
{ \
+ if (update_op_prepare_num_token(op) != 0) \
+ return -1; \
struct update_array_item *item = \
update_array_extract_item(field, op); \
if (item == NULL) \
return -1; \
- if (! update_op_is_term(op)) \
+ if (! update_op_is_term(op)) { \
+ op->token_type = JSON_TOKEN_END; \
return do_op_##op_type(op, &item->field); \
+ } \
if (item->field.type != UPDATE_NOP) \
return update_err_double(op); \
if (update_op_do_##op_type(op, item->field.data) != 0) \
diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c
index 6baad02dd..db02959ff 100644
--- a/src/box/update/update_field.c
+++ b/src/box/update/update_field.c
@@ -594,6 +594,22 @@ update_op_by(char opcode)
}
}
+int
+update_op_consume_token(struct update_op *op)
+{
+ struct json_token token;
+ int rc = json_lexer_next_token(&op->lexer, &token);
+ if (rc != 0)
+ return update_err_bad_json(op, rc);
+ if (token.type == JSON_TOKEN_END)
+ return update_err_no_such_field(op);
+ op->token_type = token.type;
+ op->key = token.str;
+ op->key_len = token.len;
+ op->field_no = token.num;
+ return 0;
+}
+
int
update_op_decode(struct update_op *op, int index_base,
struct tuple_dictionary *dict, const char **expr)
@@ -622,6 +638,7 @@ update_op_decode(struct update_op *op, int index_base,
diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
return -1;
}
+ op->token_type = JSON_TOKEN_NUM;
int32_t field_no;
switch(mp_typeof(**expr)) {
case MP_INT:
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
index d4499eff8..ee42ba5ae 100644
--- a/src/box/update/update_field.h
+++ b/src/box/update/update_field.h
@@ -175,8 +175,18 @@ struct update_op {
const struct update_op_meta *meta;
/** Operation arguments. */
union update_op_arg arg;
- /** First level field no. */
- int32_t field_no;
+ /**
+ * Current level token. END means that it is invalid and
+ * a next token should be extracted from the lexer.
+ */
+ enum json_token_type token_type;
+ union {
+ struct {
+ const char *key;
+ uint32_t key_len;
+ };
+ int32_t field_no;
+ };
/** Size of a new field after it is updated. */
uint32_t new_field_len;
/** Opcode symbol: = + - / ... */
@@ -189,6 +199,17 @@ struct update_op {
struct json_lexer lexer;
};
+/**
+ * Extract a next token from the operation path lexer. The result
+ * is used to decide to which child of the current map/array the
+ * operation should be forwarded. It is not just a synonym of
+ * json_lexer_next_token, because it fills some fields of @a op,
+ * and should be used only to choose a next child inside the
+ * current map/array.
+ */
+int
+update_op_consume_token(struct update_op *op);
+
/**
* Decode an update operation from MessagePack.
* @param[out] op Update operation.
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [tarantool-patches] [PATCH v2 7/8] tuple: JSON updates support intersection by arrays
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
` (5 preceding siblings ...)
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 6/8] tuple: make update operation tokens consumable Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 8/8] tuple: JSON updates support intersection by maps Vladislav Shpilevoy
7 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
Before this patch only non-intersecting JSON updates were
supported, as the simplest and fastest to implement. This patch
allows paths to intersect. They can have a common JSON prefix,
but the difference should start from an array index.
For example, this is allowed:
[1][2][3].a.b.c = 20
[1][2][4].e.f.g = 30
This is not allowed yet:
[1][2][3].a.b.c = 20
[1][2][3].a.e.f = 30
Here the first difference is 'b' vs 'e' - an intersection by map
keys.
The architecture of the solution is similar to bar updates: a new
update tree node type is added: UPDATE_ROUTE. When several
update operations have the same prefix, this prefix becomes an
UPDATE_ROUTE tree field. It stores the prefix and a subtree with
these operations.
Bar and route update nodes can branch and produce more bars and
routes when new operations come.
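A minimal Lua sketch of the difference (the space and data are
hypothetical, not from the test suite; assumes the series up to
this patch is applied):

  s = box.schema.create_space('example')
  s:create_index('pk')
  t = s:replace({1, {{{a = 1, b = 2}, {e = 3, f = 4}}}})
  -- Allowed: the common prefix is [2][1], the first difference
  -- is an array index ([1] vs [2]).
  t:update({{'=', '[2][1][1].a', 20}, {'=', '[2][1][2].e', 30}})
  -- Not allowed yet: the common prefix is [2][1][1], the first
  -- difference is a map key ('a' vs 'b').
  t:update({{'=', '[2][1][1].a', 20}, {'=', '[2][1][1].b', 30}})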
Part of #1261
---
src/box/CMakeLists.txt | 1 +
src/box/update/update_array.c | 49 +++++
src/box/update/update_bar.c | 9 +-
src/box/update/update_field.c | 4 +
src/box/update/update_field.h | 93 ++++++++++
src/box/update/update_route.c | 334 ++++++++++++++++++++++++++++++++++
test/box/update.result | 177 ++++++++++++++++++
test/box/update.test.lua | 91 +++++++++
8 files changed, 753 insertions(+), 5 deletions(-)
create mode 100644 src/box/update/update_route.c
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 2888f4d7d..67461bd9b 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -44,6 +44,7 @@ add_library(tuple STATIC
update/update_field.c
update/update_array.c
update/update_bar.c
+ update/update_route.c
tuple_compare.cc
tuple_extract_key.cc
tuple_hash.cc
diff --git a/src/box/update/update_array.c b/src/box/update/update_array.c
index 262eceafb..2e354ed42 100644
--- a/src/box/update/update_array.c
+++ b/src/box/update/update_array.c
@@ -190,6 +190,55 @@ update_array_create(struct update_field *field, const char *header,
return rope_append(field->array.rope, item, field_count);
}
+int
+update_array_create_with_child(struct update_field *field,
+ const struct update_field *child,
+ int32_t field_no, const char *header)
+{
+ const char *data = header;
+ uint32_t field_count = mp_decode_array(&data);
+ const char *first_field = data;
+ const char *first_field_end = first_field;
+ mp_next(&first_field_end);
+ struct region *region = &fiber()->gc;
+ struct rope *rope = rope_new(region);
+ if (rope == NULL)
+ return -1;
+ struct update_array_item *item =
+ (struct update_array_item *) rope_alloc(region, sizeof(*item));
+ if (item == NULL)
+ return -1;
+ const char *end = first_field_end;
+ if (field_no > 0) {
+ for (int32_t i = 1; i < field_no; ++i)
+ mp_next(&end);
+ update_array_item_create(item, UPDATE_NOP, first_field,
+ first_field_end - first_field,
+ end - first_field_end);
+ if (rope_append(rope, item, field_no) != 0)
+ return -1;
+ item = (struct update_array_item *) rope_alloc(region,
+ sizeof(*item));
+ if (item == NULL)
+ return -1;
+ first_field = end;
+ first_field_end = first_field;
+ mp_next(&first_field_end);
+ end = first_field_end;
+ }
+ for (uint32_t i = field_no + 1; i < field_count; ++i)
+ mp_next(&end);
+ item->field = *child;
+ update_array_item_create(item, child->type, first_field,
+ first_field_end - first_field,
+ end - first_field_end);
+ field->type = UPDATE_ARRAY;
+ field->data = header;
+ field->size = end - header;
+ field->array.rope = rope;
+ return rope_append(rope, item, field_count - field_no);
+}
+
uint32_t
update_array_sizeof(struct update_field *field)
{
diff --git a/src/box/update/update_bar.c b/src/box/update/update_bar.c
index f4fc00716..c8e28347e 100644
--- a/src/box/update/update_bar.c
+++ b/src/box/update/update_bar.c
@@ -250,12 +250,11 @@ do_op_nop_delete(struct update_op *op, struct update_field *field)
int \
do_op_bar_##op_type(struct update_op *op, struct update_field *field) \
{ \
- (void) op; \
- (void) field; \
assert(field->type == UPDATE_BAR); \
- diag_set(ClientError, ER_UNSUPPORTED, "update", \
- "intersected JSON paths"); \
- return -1; \
+ field = update_route_branch(field, op); \
+ if (field == NULL) \
+ return -1; \
+ return do_op_##op_type(op, field); \
}
DO_SCALAR_OP_GENERIC(insert)
diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c
index db02959ff..161dd5875 100644
--- a/src/box/update/update_field.c
+++ b/src/box/update/update_field.c
@@ -116,6 +116,8 @@ update_field_sizeof(struct update_field *field)
return update_array_sizeof(field);
case UPDATE_BAR:
return update_bar_sizeof(field);
+ case UPDATE_ROUTE:
+ return update_route_sizeof(field);
default:
unreachable();
}
@@ -142,6 +144,8 @@ update_field_store(struct update_field *field, char *out, char *out_end)
return update_array_store(field, out, out_end);
case UPDATE_BAR:
return update_bar_store(field, out, out_end);
+ case UPDATE_ROUTE:
+ return update_route_store(field, out, out_end);
default:
unreachable();
}
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
index ee42ba5ae..81c010615 100644
--- a/src/box/update/update_field.h
+++ b/src/box/update/update_field.h
@@ -266,6 +266,22 @@ enum update_type {
* nodes. And this is the most common case.
*/
UPDATE_BAR,
+ /**
+ * Field with a subtree of updates having the same prefix,
+ * stored here explicitly. New updates with the same
+ * prefix just follow it without decoding of JSON nor
+ * MessagePack. It can be quite helpful when an update
+ * works with the same internal object via several
+ * operations like this:
+ *
+ * [1][2].a.b.c[3].key1 = 20
+ * [1][2].a.b.c[3].key2 = 30
+ * [1][2].a.b.c[3].key3 = true
+ *
+ * Here [1][2].a.b.c[3] is stored only once as a route
+ * with a subtree 'key1 = 20', 'key2 = 30', 'key3 = true'.
+ */
+ UPDATE_ROUTE,
};
/**
@@ -343,6 +359,17 @@ struct update_field {
};
};
} bar;
+ /** Route update - path to an update subtree. */
+ struct {
+ /**
+ * Common prefix of all updates in the
+ * subtree.
+ */
+ const char *path;
+ int path_len;
+ /** Update subtree. */
+ struct update_field *next_hop;
+ } route;
};
};
@@ -408,6 +435,31 @@ update_array_create(struct update_field *field, const char *header,
const char *data, const char *data_end,
uint32_t field_count);
+/**
+ * The same as plain create, but the array is created with a
+ * first child already. It is more efficient, because under the
+ * hood a rope is created not as one huge range which is then
+ * split in parts, but as two rope nodes from the beginning. In
+ * summary: -1 rope node split, -1 decoding of fields up to
+ * @a field_no.
+ *
+ * The function is used during branching, where there was an
+ * existing update, but another one came with the same prefix, and
+ * a different suffix.
+ *
+ * @param[out] field Field to initialize.
+ * @param child A child subtree.
+ * @param field_no Field number of @a child.
+ * @param header Beginning of the array, MessagePack header.
+ *
+ * @retval 0 Success.
+ * @retval -1 Error.
+ */
+int
+update_array_create_with_child(struct update_field *field,
+ const struct update_field *child,
+ int32_t field_no, const char *header);
+
OP_DECL_GENERIC(array)
/* }}} update_field.array */
@@ -424,6 +476,34 @@ OP_DECL_GENERIC(nop)
/* }}} update_field.nop */
+/* {{{ update_field.route */
+
+/**
+ * Take a bar or a route @a field and split its path in the place
+ * where @a new_op should be applied. Prefix becomes a new route,
+ * suffix becomes a child of the result field. As a result, @a
+ * field stays the root of its subtree, and a node of that
+ * subtree is returned, to which @a new_op should be applied.
+ *
+ * Note, this function does not apply @a new_op. It just finds
+ * where it should be applied and does all the preparations. This
+ * is done deliberately, because otherwise the do_cb() virtual
+ * function of @a new_op would have to be called here, as there
+ * is no other context. But a caller always knows exactly whether
+ * it was insert, set, arith, etc., and can and does use a more
+ * specific function like do_op_set, do_op_insert ...
+ *
+ * @param field Field to find where to apply @a new_op.
+ * @param new_op New operation to apply.
+ * @return A field to which @a new_op should be applied.
+ */
+struct update_field *
+update_route_branch(struct update_field *field, struct update_op *new_op);
+
+OP_DECL_GENERIC(route)
+
+/* }}} update_field.route */
+
#undef OP_DECL_GENERIC
/* {{{ Common helpers. */
@@ -435,6 +515,17 @@ OP_DECL_GENERIC(nop)
* Each child can be another array, a bar, a route, a map -
* anything. The functions below help to make such places shorter
* and simpler.
+ *
+ * Note that they are recursive, although it is not clearly
+ * visible. For example, if an update tree contains several array
+ * nodes on one tree branch, then an update of the deepest array
+ * goes through each of these nodes and calls do_op_array_<opname>
+ * on the needed children. But it is ok, because the operation
+ * count is usually small (<<50 in most cases, usually <= 5), and
+ * the update tree height is not bigger than the operation count.
+ * Also, the fiber stack is big enough to fit ~10k update tree
+ * depth - an incredible number, even though the real limit is 4k
+ * due to the limited number of operations.
*/
#define OP_DECL_GENERIC(op_type) \
static inline int \
@@ -447,6 +538,8 @@ do_op_##op_type(struct update_op *op, struct update_field *field) \
return do_op_nop_##op_type(op, field); \
case UPDATE_BAR: \
return do_op_bar_##op_type(op, field); \
+ case UPDATE_ROUTE: \
+ return do_op_route_##op_type(op, field); \
default: \
unreachable(); \
} \
diff --git a/src/box/update/update_route.c b/src/box/update/update_route.c
new file mode 100644
index 000000000..f1d2ced65
--- /dev/null
+++ b/src/box/update/update_route.c
@@ -0,0 +1,334 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "update_field.h"
+#include "fiber.h"
+#include "box/tuple.h"
+
+/**
+ * Do the actual branch. This is the case when an existing
+ * bar/route path is different from a new operation's path in an
+ * array.
+ * @param[out] next_hop A field which will be initialized as an
+ *        array, and which will be the point where the new
+ *        operation is applied.
+ * @param child Current field from which the branch happens. It
+ *        already contains an old update subtree.
+ * @param field_no Field number of @a child in the @a parent array.
+ * @param parent MessagePack array to be taken by @a next_hop.
+ */
+static int
+update_route_branch_array(struct update_field *next_hop,
+ const struct update_field *child, int32_t field_no,
+ const char *parent)
+{
+ struct update_op *op = child->bar.op;
+ /*
+ * There are rules when a subtree can be just copied as is
+ * from one parent to another.
+	 * 1) It should not be a leaf. If it is not a leaf, then it
+	 *    does not change the header and other fields of this
+	 *    particular array, and can be safely copied into one
+	 *    of its fields. Otherwise check the next rules.
+	 * 2) It should not be a bar. Because if it is not a bar
+	 *    and is a leaf, then it is scalar. Again, it changes
+	 *    only its own field, does not affect anything else,
+	 *    and can be copied. Otherwise go down.
+	 * 3) Ok, it is a bar, a leaf. It is a bar with zero path
+	 *    length. In this case it should be a scalar bar, which
+	 *    is checked below. The only non-scalar operations are
+	 *    '!' and '#'.
+ *
+	 * Why can't '#' and '!' be copied? '!', for example,
+	 * being applied to a field [1], affects all fields [2-*],
+	 * and the array header. The same is true about '#', in an
+	 * even worse form. Such operations should be redone. They
+	 * affect many fields and the parent.
+	 *
+	 * There is a tricky thing though - why not just redo
+	 * all operations here, for the sake of code simplicity?
+	 * It would allow to drop the 'create_with_child' crutch.
+	 * The answer is - it is not possible. If a field is
+	 * copyable, it is not re-applicable, and vice versa. For
+	 * example, if it is not a leaf, then there may be many
+	 * operations, not one. A subtree just can't be
+	 * 're-applied'.
+	 *
+	 * If the operation is scalar and a leaf, then its result
+	 * has already overridden its arguments. This is because
+	 * scalar operations save the result into the arguments to
+	 * save memory. Applying such an operation a second time
+	 * would lead to very surprising results.
+	 *
+	 * Another reason is performance. This path should be
+	 * quite hot, and copying a struct is certainly much faster
+	 * than reapplying an operation via a virtual function.
+	 * Operations '!' and '#' are quite rare, so their
+	 * optimization is not the goal.
+ */
+ if (child->type != UPDATE_BAR || child->bar.path_len > 0 ||
+ (op->opcode != '!' && op->opcode != '#')) {
+ return update_array_create_with_child(next_hop, child,
+ field_no, parent);
+ }
+ op->token_type = JSON_TOKEN_NUM;
+ op->field_no = field_no;
+ const char *data = parent;
+ uint32_t field_count = mp_decode_array(&data);
+ const char *end = data;
+ for (uint32_t i = 0; i < field_count; ++i)
+ mp_next(&end);
+ if (update_array_create(next_hop, parent, data, end, field_count) != 0)
+ return -1;
+ return op->meta->do_cb(op, next_hop);
+}
+
+struct update_field *
+update_route_branch(struct update_field *field, struct update_op *new_op)
+{
+ assert(new_op->lexer.src != NULL);
+ const char *old_path;
+ int old_path_len;
+ if (field->type == UPDATE_BAR) {
+ old_path = field->bar.path;
+ old_path_len = field->bar.path_len;
+ } else {
+ assert(field->type == UPDATE_ROUTE);
+ old_path = field->route.path;
+ old_path_len = field->route.path_len;
+ }
+ assert(old_path != NULL);
+ struct json_lexer old_path_lexer;
+ struct json_token old_token, new_token;
+ /*
+	 * The offset is going to be used as the length of the
+	 * route node created as a parent of the old subtree and
+	 * the new operation. As described in the route definition
+	 * in struct update_field, a route is the common prefix of
+	 * all operations of the subtree. Here its length is
+	 * calculated.
+	 * In addition, it is used here to determine if the new
+	 * operation differs from the current subtree from the
+	 * very beginning. Basically, it means the offset is 0,
+	 * and no route is generated. The root becomes a regular
+	 * update field (array, map), not a route. Example:
+	 * @a field is a bar '[1].a.b = 20', @a new_op is
+	 * '[2].c.d = 30'. In this case the paths are already
+	 * different from the beginning, no common prefix = no
+	 * route. An array with children [1].a.b and [2].c.d
+	 * becomes the root.
+ */
+ int saved_old_offset;
+ json_lexer_create(&old_path_lexer, old_path, old_path_len,
+ TUPLE_INDEX_BASE);
+ const char *parent = field->data;
+ int rc;
+ do {
+ saved_old_offset = old_path_lexer.offset;
+ rc = json_lexer_next_token(&old_path_lexer, &old_token);
+ /* Old path is already validated. */
+ assert(rc == 0);
+ rc = json_lexer_next_token(&new_op->lexer, &new_token);
+ if (rc != 0) {
+ update_err_bad_json(new_op, rc);
+ return NULL;
+ }
+ if (json_token_cmp(&old_token, &new_token) != 0)
+ break;
+ switch(new_token.type) {
+ case JSON_TOKEN_NUM:
+ rc = tuple_field_go_to_index(&parent, new_token.num);
+ break;
+ case JSON_TOKEN_STR:
+ rc = tuple_field_go_to_key(&parent, new_token.str,
+ new_token.len);
+ break;
+ default:
+ /*
+ * Can't be ANY, because old and new
+ * tokens are equal, but '*' is invalid in
+ * paths and old was already checked for
+ * that.
+ */
+ assert(new_token.type == JSON_TOKEN_END);
+ update_err_double(new_op);
+ return NULL;
+ }
+ /*
+ * Must always find a field, because old
+ * token already went that path.
+ */
+ assert(rc == 0);
+ } while (true);
+ enum mp_type type = mp_typeof(*parent);
+ /*
+ * The paths are different from the very beginning. It
+ * means, that the old field should be transformed instead
+ * of creating a new route node.
+ */
+ bool transform_root = saved_old_offset == 0;
+ struct update_field *next_hop;
+ if (! transform_root) {
+ next_hop = (struct update_field *)
+ region_alloc(&fiber()->gc, sizeof(*next_hop));
+ if (next_hop == NULL) {
+ diag_set(OutOfMemory, sizeof(*next_hop), "region_alloc",
+ "next_hop");
+ return NULL;
+ }
+ } else {
+ next_hop = field;
+ }
+ if (type == MP_MAP) {
+ diag_set(ClientError, ER_UNSUPPORTED, "update",
+ "path intersection on map");
+ return NULL;
+ }
+ if (new_token.type != JSON_TOKEN_NUM) {
+ update_err(new_op, "can not update array by non-integer index");
+ return NULL;
+ }
+ if (type != MP_ARRAY) {
+ update_err_no_such_field(new_op);
+ return NULL;
+ }
+
+ int path_offset = old_path_lexer.offset;
+ struct update_field child = *field;
+ if (child.type == UPDATE_ROUTE) {
+ child.route.path += path_offset;
+ child.route.path_len -= path_offset;
+ if (child.route.path_len == 0)
+ child = *child.route.next_hop;
+ } else {
+ assert(child.type == UPDATE_BAR);
+ child.bar.path += path_offset;
+ child.bar.path_len -= path_offset;
+ /*
+ * Yeah, bar length can become 0 here, but it is
+ * ok as long as it is not a non-scalar operation
+ * like '!' or '#'. In these cases it changes the
+ * parent too. When a bar is scalar ('=', arith,
+ * splice, ...), it operates on one concrete field
+ * and works even if its path len is 0. Of course,
+ * it is not a big deal to fix such bar here
+ * making it scalar, but it would be additional
+ * branching on an actually hot path.
+ * Talking of '#' and '!' - they are handled by
+ * array and map 'branchers' internally, below.
+ */
+ }
+
+ new_op->token_type = JSON_TOKEN_NUM;
+ new_op->field_no = new_token.num;
+ if (update_route_branch_array(next_hop, &child, old_token.num,
+ parent) != 0)
+ return NULL;
+ if (! transform_root) {
+ field->type = UPDATE_ROUTE;
+ field->route.path = old_path;
+ field->route.path_len = saved_old_offset;
+ field->route.next_hop = next_hop;
+ }
+ return next_hop;
+}
+
+/**
+ * Obtain a next node of the update tree to which @a op should be
+ * propagated. It is the same as branch, but has a fast path for
+ * the case when @a field is a route and the operation prefix
+ * matches this route - then there is no need to parse JSON and
+ * dive into MessagePack, the route is just followed via a lexer
+ * offset increase.
+ *
+ * @param field Field to find where to apply @a op.
+ * @param op New operation to apply.
+ * @return A field to which @a op should be applied.
+ */
+static struct update_field *
+update_route_next(struct update_field *field, struct update_op *op)
+{
+ assert(field->type == UPDATE_ROUTE);
+ assert(! update_op_is_term(op));
+ const char *new_path = op->lexer.src + op->lexer.offset;
+ int new_path_len = op->lexer.src_len - op->lexer.offset;
+ if (field->route.path_len <= new_path_len &&
+ memcmp(field->route.path, new_path, field->route.path_len) == 0) {
+ /*
+ * Fast path: jump to the next hop with no
+		 * decoding. It is used when several JSON updates
+ * have the same prefix.
+ */
+ op->lexer.offset += field->route.path_len;
+ return field->route.next_hop;
+ }
+ return update_route_branch(field, op);
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type) \
+int \
+do_op_route_##op_type(struct update_op *op, struct update_field *field) \
+{ \
+ assert(field->type == UPDATE_ROUTE); \
+ struct update_field *next_hop = update_route_next(field, op); \
+ if (next_hop == NULL) \
+ return -1; \
+ return do_op_##op_type(op, next_hop); \
+}
+
+DO_SCALAR_OP_GENERIC(set)
+
+DO_SCALAR_OP_GENERIC(insert)
+
+DO_SCALAR_OP_GENERIC(delete)
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+uint32_t
+update_route_sizeof(struct update_field *field)
+{
+ return field->size - field->route.next_hop->size +
+ update_field_sizeof(field->route.next_hop);
+}
+
+uint32_t
+update_route_store(struct update_field *field, char *out, char *out_end)
+{
+ char *saved_out = out;
+ int before_hop = field->route.next_hop->data - field->data;
+ memcpy(out, field->data, before_hop);
+ out += before_hop;
+ out += update_field_store(field->route.next_hop, out, out_end);
+ int after_hop = before_hop + field->route.next_hop->size;
+ memcpy(out, field->data + after_hop, field->size - after_hop);
+ return out + field->size - after_hop - saved_out;
+}
diff --git a/test/box/update.result b/test/box/update.result
index 38b8bc13e..2c5978d8a 100644
--- a/test/box/update.result
+++ b/test/box/update.result
@@ -1294,6 +1294,183 @@ vy_s:select()
vy_s:drop()
---
...
+--
+-- Complex updates. Intersected paths, different types.
+--
+--
+-- =
+--
+-- Difference in the last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][1][2]', 375}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[325, 375], [400,
+ 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+-- Different starting from the non-last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][2][2]', 475}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+ 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[325, 350], [400,
+ 475]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+-- Contains string keys in the middle.
+t:update({{'=', '[2].c.f[1]', 4000}, {'=', '[2].c.f[2]', 5000}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4000, 5000, 6, 7, 8],
+ 'e': 500, 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350],
+ [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t = {1}
+---
+...
+t[2] = setmetatable({}, {__serialize = 'map'})
+---
+...
+t[3] = {}
+---
+...
+t[4] = { \
+ 1, \
+ 2, \
+ 3, \
+ { \
+ 4, \
+ 5, \
+ 6, \
+ 7, \
+ { \
+ 8, \
+ 9, \
+ {10, 11, 12}, \
+ 13, \
+ 14, \
+ }, \
+ 15, \
+ 16, \
+ {17, 18, 19} \
+ }, \
+ 20, \
+ { \
+ a = 21, \
+ b = 22, \
+ c = { \
+ d = 23, \
+ e = 24 \
+ } \
+ }, \
+ 25 \
+}
+---
+...
+t = s:replace(t)
+---
+...
+-- Non-first and non-last field updates.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][8][3]', 19000}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11000, 12], 13, 14], 15, 16, [17,
+ 18, 19000]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Triple intersection with enabled fast jump by route.
+t:update({{'=', '[4][4][3]', 6000}, {'=', '[4][4][5][2]', 9000}, {'=', '[4][4][8][2]', 18000}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6000, 7, [8, 9000, [10, 11, 12], 13, 14], 15, 16, [
+ 17, 18000, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Triple intersection with no optimizations.
+t:update({{'=', '[4][3]', 3000}, {'=', '[4][4][2]', 5000}, {'=', '[4][4][5][1]', 8000}})
+---
+- [1, {}, [], [1, 2, 3000, [4, 5000, 6, 7, [8000, 9, [10, 11, 12], 13, 14], 15, 16,
+ [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Last operation splits previous ones.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][5][3][1]', 10000}, {'=', '[4][4][3]', 6000}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6000, 7, [8, 9, [10000, 11000, 12], 13, 14], 15, 16,
+ [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- First two operations generate a route '[4]' in the 4th field of
+-- the tuple. With children [1] and [2]. The last operation should
+-- delete the route and replace it with a full-featured array.
+t:update({{'=', '[4][4][1]', 4.5}, {'=', '[4][4][2]', 5.5}, {'=', '[4][3]', 3.5}})
+---
+- [1, {}, [], [1, 2, 3.5, [4.5, 5.5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [
+ 17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Test errors.
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][5].a', -1}})
+---
+- error: 'Field ''[4][4][5].a'' UPDATE error: can not update array by non-integer
+ index'
+...
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][*].a', -1}})
+---
+- error: 'Field ''[4][4][*].a'' UPDATE error: can not update array by non-integer
+ index'
+...
+t:update({{'=', '[4][4][*].b', 8000}, {'=', '[4][4][*].a', -1}})
+---
+- error: 'Field ''[4][4][*].b'' UPDATE error: invalid JSON in position 8'
+...
+t:update({{'=', '[4][4][1]', 4000}, {'=', '[4][4]-badjson', -1}})
+---
+- error: 'Field ''[4][4]-badjson'' UPDATE error: invalid JSON in position 7'
+...
+t:update({{'=', '[4][4][1]', 1}, {'=', '[4][4][1]', 2}})
+---
+- error: 'Field ''[4][4][1]'' UPDATE error: double update of the same field'
+...
+-- First two operations produce a zero-length bar update in field
+-- [4][4][5][4]. The third operation should raise an error.
+t:update({{'+', '[4][4][5][4]', 13000}, {'=', '[4][4][5][1]', 8000}, {'+', '[4][4][5][4]', 1}})
+---
+- error: 'Field ''[4][4][5][4]'' UPDATE error: double update of the same field'
+...
+--
+-- !
+--
+t:update({{'!', '[4][5]', 19.5}, {'!', '[4][6]', 19.75}, {'!', '[4][4][3]', 5.5}, {'=', '[4][4][2]', 5.25}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5.25, 5.5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16,
+ [17, 18, 19]], 19.5, 19.75, 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}},
+ 25]]
+...
+--
+-- #
+--
+t:update({{'#', '[4][4][5]', 1}, {'#', '[4][4][4]', 1}, {'#', '[4][4][5]', 1}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 15, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {
+ 'd': 23, 'e': 24}}, 25]]
+...
+--
+-- Scalar operations.
+--
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}})
+---
+- [1, {}, [], [1, 2, 3.5, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14.75], 15, 16, [
+ 17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}, {'+', '[4][4][3]', 0.25}})
+---
+- [1, {}, [], [1, 2, 3.5, [4, 5, 6.25, 7, [8, 9, [10, 11, 12], 13, 14.75], 15, 16,
+ [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Combined - check that operations following '!' take into
+-- account that there is a new element. Indexes should be relative
+-- to the new array size.
+t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]', 0.25}})
+---
+- [1, {}, [], [1, 2, 2.5, 3, [4, 5, 6.25, 7, [8, 9, [10, 11, 12], 13, 14.75], 15,
+ 16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Error - an attempt to follow JSON path in a scalar field
+-- during branching.
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
+---
+- error: Field ''[4][3][2]'' was not found in the tuple
+...
s:drop()
---
...
diff --git a/test/box/update.test.lua b/test/box/update.test.lua
index 60e669d27..d91a2e95c 100644
--- a/test/box/update.test.lua
+++ b/test/box/update.test.lua
@@ -424,4 +424,95 @@ box.commit()
vy_s:select()
vy_s:drop()
+--
+-- Complex updates. Intersected paths, different types.
+--
+
+--
+-- =
+--
+-- Difference in the last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][1][2]', 375}})
+-- Different starting from the non-last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][2][2]', 475}})
+-- Contains string keys in the middle.
+t:update({{'=', '[2].c.f[1]', 4000}, {'=', '[2].c.f[2]', 5000}})
+t = {1}
+t[2] = setmetatable({}, {__serialize = 'map'})
+t[3] = {}
+t[4] = { \
+ 1, \
+ 2, \
+ 3, \
+ { \
+ 4, \
+ 5, \
+ 6, \
+ 7, \
+ { \
+ 8, \
+ 9, \
+ {10, 11, 12}, \
+ 13, \
+ 14, \
+ }, \
+ 15, \
+ 16, \
+ {17, 18, 19} \
+ }, \
+ 20, \
+ { \
+ a = 21, \
+ b = 22, \
+ c = { \
+ d = 23, \
+ e = 24 \
+ } \
+ }, \
+ 25 \
+}
+t = s:replace(t)
+-- Non-first and non-last field updates.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][8][3]', 19000}})
+-- Triple intersection with enabled fast jump by route.
+t:update({{'=', '[4][4][3]', 6000}, {'=', '[4][4][5][2]', 9000}, {'=', '[4][4][8][2]', 18000}})
+-- Triple intersection with no optimizations.
+t:update({{'=', '[4][3]', 3000}, {'=', '[4][4][2]', 5000}, {'=', '[4][4][5][1]', 8000}})
+-- Last operation splits previous ones.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][5][3][1]', 10000}, {'=', '[4][4][3]', 6000}})
+-- First two operations generate a route '[4]' in the 4th field of
+-- the tuple. With children [1] and [2]. The last operation should
+-- delete the route and replace it with a full-featured array.
+t:update({{'=', '[4][4][1]', 4.5}, {'=', '[4][4][2]', 5.5}, {'=', '[4][3]', 3.5}})
+-- Test errors.
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][5].a', -1}})
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][*].a', -1}})
+t:update({{'=', '[4][4][*].b', 8000}, {'=', '[4][4][*].a', -1}})
+t:update({{'=', '[4][4][1]', 4000}, {'=', '[4][4]-badjson', -1}})
+t:update({{'=', '[4][4][1]', 1}, {'=', '[4][4][1]', 2}})
+-- First two operations produce a zero-length bar update in field
+-- [4][4][5][4]. The third operation should raise an error.
+t:update({{'+', '[4][4][5][4]', 13000}, {'=', '[4][4][5][1]', 8000}, {'+', '[4][4][5][4]', 1}})
+
+--
+-- !
+--
+t:update({{'!', '[4][5]', 19.5}, {'!', '[4][6]', 19.75}, {'!', '[4][4][3]', 5.5}, {'=', '[4][4][2]', 5.25}})
+--
+-- #
+--
+t:update({{'#', '[4][4][5]', 1}, {'#', '[4][4][4]', 1}, {'#', '[4][4][5]', 1}})
+--
+-- Scalar operations.
+--
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}})
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}, {'+', '[4][4][3]', 0.25}})
+-- Combined - check that operations following '!' take into
+-- account that there is a new element. Indexes should be relative
+-- to the new array size.
+t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]', 0.25}})
+-- Error - an attempt to follow JSON path in a scalar field
+-- during branching.
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
+
s:drop()
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [tarantool-patches] [PATCH v2 8/8] tuple: JSON updates support intersection by maps
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
` (6 preceding siblings ...)
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 7/8] tuple: JSON updates support intersection by arrays Vladislav Shpilevoy
@ 2019-08-31 21:35 ` Vladislav Shpilevoy
7 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-31 21:35 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
Previous commits introduced JSON updates with any non-intersecting
JSON paths, and allowed intersection by arrays. This one completes
the puzzle: now any JSON updates are available.
Closes #1261
@TarantoolBot document
Title: JSON updates
Tuple/space/index:update/upsert now support JSON paths - all the
same paths as allowed in tuple["..."], including field names in
the first JSON path token.
Example:
box.cfg{}
format = {}
format[1] = {'field1', 'unsigned'}
format[2] = {'field2', 'map'}
format[3] = {'field3', 'array'}
format[4] = {'field4', 'string', is_nullable = true}
s = box.schema.create_space('test', {format = format})
_ = s:create_index('pk')
t = {
1,
{
key1 = 'value',
key2 = 10
},
{
2,
3,
{key3 = 20}
}
}
t = s:replace(t)
tarantool> t:update({{'=', 'field2.key1', 'new_value'}})
---
- [1, {'key1': 'new_value', 'key2': 10}, [2, 3, {'key3': 20}]]
...
tarantool> t:update({{'+', 'field3[2]', 1}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 4, {'key3': 20}]]
...
tarantool> s:update({1}, {{'!', 'field4', 'inserted value'}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 3, {'key3': 20}], 'inserted value']
...
tarantool> s:update({1}, {{'#', '[2].key2', 1}, {'=', '[3][3].key4', 'value4'}})
---
- [1, {'key1': 'value'}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...
tarantool> s:upsert({1, {k = 'v'}, {}}, {{'#', '[2].key1', 1}})
---
...
tarantool> s:select{}
---
- - [1, {}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...
Note that the same rule as in tuple field access by JSON applies
to field names that look like JSON paths. The rule is that first
the whole path is interpreted as a field name. If such a name
does not exist, then it is treated as a path. For example, if
there is a field named 'field.name.like.json', then this update
<obj>:update({..., 'field.name.like.json', ...})
will update this field instead of keys 'field' -> 'name' ->
'like' -> 'json'. If such a name is needed as part of a bigger
path, then it should be wrapped in quotes and []:
<obj>:update({..., '["field.name.like.json"].next.fields', ...})
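A minimal sketch of that rule (the space, field names, and values
below are purely illustrative, they are not part of this patch):
format = {}
format[1] = {'id', 'unsigned'}
format[2] = {'point.x', 'unsigned'}
demo = box.schema.create_space('demo', {format = format})
_ = demo:create_index('pk')
demo:replace{1, 10}
-- 'point.x' is first tried as a whole field name, so this adds 5
-- to the second field instead of searching for key 'x' inside a
-- field named 'point':
demo:update({1}, {{'+', 'point.x', 5}})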
There are some new rules for JSON updates:
- Operation '!' can't be used to create all intermediate nodes of
a path. For example, {'!', 'field1[1].field3', ...} can't
create fields 'field1' and '[1]' - they must already exist.
- Operation '#', when applied to maps, can't delete more than one
key at once. That is, its argument must always be 1 for maps.
{'#', 'field1.field2', 1} - this is allowed;
{'#', 'field1.field2', 10} - this is not.
That limitation originates from the fact that keys in a map
are not ordered in any way, so '#' with more than 1 key would
lead to undefined behaviour.
- Operation '!' on maps can't create a key if it already exists.
- If a map contains non-string keys (booleans, numbers, maps,
arrays - anything), then these keys can't be updated via JSON
paths. But it is still allowed to update string keys in such a
map.
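To make these rules concrete, here is a short sketch (the tuple
state and key names are illustrative; the error texts are
paraphrased from the test suite added by this patch):
-- Assume field [2] of tuple t currently holds {'key1': 'value'}.
t:update({{'#', '[2].key1', 2}})
-- error: can delete only 1 field from a map in a row
t:update({{'!', '[2].key1', 'x'}})
-- error: the key exists already
t:update({{'!', '[2].a.b', 1}})
-- error: field '[2].a.b' was not found in the tuple
-- ('!' does not create the intermediate key 'a')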
Why are JSON updates good, and why should they be preferred when
only a part of a tuple needs to be updated?
- They consume less space in WAL, because for an update only its
keys, operations, and arguments are stored. It is cheaper to
store an update of one deep field than the whole tuple subtree.
- They are faster. Firstly, this is because they are implemented
in C, and have no problems with Lua GC and dynamic typing.
Secondly, some cases of JSON paths are highly optimized. For
example, an update with a single JSON path costs O(1) memory
regardless of how deep that path goes (not counting update
arguments).
- They are available from remote clients, as well as any other
DML. Before JSON updates, changing one deep part of a tuple
required downloading that tuple, updating it in memory, and
sending it back - 2 network hops. With JSON paths it can be 1,
when the update can be described in paths.
---
src/box/CMakeLists.txt | 1 +
src/box/update/update_field.c | 4 +
src/box/update/update_field.h | 84 ++++++-
src/box/update/update_map.c | 441 ++++++++++++++++++++++++++++++++++
src/box/update/update_route.c | 74 ++++--
test/box/update.result | 152 ++++++++++++
test/box/update.test.lua | 75 ++++++
7 files changed, 813 insertions(+), 18 deletions(-)
create mode 100644 src/box/update/update_map.c
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 67461bd9b..a68231189 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -45,6 +45,7 @@ add_library(tuple STATIC
update/update_array.c
update/update_bar.c
update/update_route.c
+ update/update_map.c
tuple_compare.cc
tuple_extract_key.cc
tuple_hash.cc
diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c
index 161dd5875..fdc1a2512 100644
--- a/src/box/update/update_field.c
+++ b/src/box/update/update_field.c
@@ -118,6 +118,8 @@ update_field_sizeof(struct update_field *field)
return update_bar_sizeof(field);
case UPDATE_ROUTE:
return update_route_sizeof(field);
+ case UPDATE_MAP:
+ return update_map_sizeof(field);
default:
unreachable();
}
@@ -146,6 +148,8 @@ update_field_store(struct update_field *field, char *out, char *out_end)
return update_bar_store(field, out, out_end);
case UPDATE_ROUTE:
return update_route_store(field, out, out_end);
+ case UPDATE_MAP:
+ return update_map_store(field, out, out_end);
default:
unreachable();
}
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
index 81c010615..c3a9b6d97 100644
--- a/src/box/update/update_field.h
+++ b/src/box/update/update_field.h
@@ -33,6 +33,7 @@
#include "trivia/util.h"
#include "tt_static.h"
#include <stdint.h>
+#include "salad/stailq.h"
#include "json/json.h"
#include "bit/int96.h"
#include "mp_decimal.h"
@@ -245,7 +246,8 @@ enum update_type {
* for example, when a rope is split in two parts: not
* changed left range of fields, and a right range with
* its first field changed. In this case the left range is
- * NOP.
+	 * NOP. The same happens when a map is updated and split
+	 * into ranges, where only certain points are not NOP.
*/
UPDATE_NOP,
/**
@@ -282,6 +284,11 @@ enum update_type {
* with a subtree 'key1 = 20', 'key2 = 30', 'key3 = true'.
*/
UPDATE_ROUTE,
+ /**
+ * Field is an updated map. Check item list for updates of
+ * individual fields.
+ */
+ UPDATE_MAP,
};
/**
@@ -370,6 +377,48 @@ struct update_field {
/** Update subtree. */
struct update_field *next_hop;
} route;
+ /**
+ * The field is an updated map. Individual fields
+ * are stored very similar to array update and its
+ * rope nodes. Each item is a key, a value, and a
+ * tail of unchanged key-value pairs. The items
+ * are stored in a list. But the list is not
+ * sorted anyhow by keys, because it does not
+ * really make any sense:
+ *
+ * 1) Keys in MessagePack are not sorted anyway,
+ * and any kind of search would not be possible
+		 *    even if they were sorted. Sorting the map
+		 *    would require N*log(N) time and N memory even
+		 *    if only a few fields were updated.
+ *
+ * 2) Double scalar update of the same key is not
+ * possible.
+ *
+ * Although there is something which can be and is
+ * optimized. When a key is updated first time,
+		 * optimized. When a key is updated for the first time,
+ * after all operations are done, it is stored
+ * in the result tuple before unchanged fields. On
+ * a second update of the same tuple it is found
+ * immediately.
+ */
+ struct {
+ /**
+ * List of map update items. Each item is
+ * a key, a value, and an unchanged tail.
+ */
+ struct stailq items;
+ /**
+ * Number of key-value pairs in the list.
+ * Note, it is not a number of items, but
+ * exactly number of key-value pairs. It
+ * is used to store MessagePack map header
+ * without decoding each item again just
+ * to learn the size.
+ */
+ int size;
+ } map;
};
};
@@ -464,6 +513,37 @@ OP_DECL_GENERIC(array)
/* }}} update_field.array */
+/* {{{ update_field.map */
+
+/**
+ * Initialize @a field as a map to update.
+ * @param[out] field Field to initialize.
+ * @param header Header of the MessagePack map @a data.
+ * @param data MessagePack data of the map to update.
+ * @param data_end End of @a data.
+ * @param field_count Key-value pair count in @data.
+ *
+ * @retval 0 Success.
+ * @retval -1 Error.
+ */
+int
+update_map_create(struct update_field *field, const char *header,
+ const char *data, const char *data_end, int field_count);
+
+/**
+ * Create a map update with an existing child. Motivation is
+ * exactly the same as for the similar array constructor: it allows
+ * avoiding unnecessary MessagePack decoding.
+ */
+int
+update_map_create_with_child(struct update_field *field,
+ const struct update_field *child, const char *key,
+ uint32_t key_len, const char *header);
+
+OP_DECL_GENERIC(map)
+
+/* }}} update_field.map */
+
/* {{{ update_field.bar */
OP_DECL_GENERIC(bar)
@@ -540,6 +620,8 @@ do_op_##op_type(struct update_op *op, struct update_field *field) \
return do_op_bar_##op_type(op, field); \
case UPDATE_ROUTE: \
return do_op_route_##op_type(op, field); \
+ case UPDATE_MAP: \
+ return do_op_map_##op_type(op, field); \
default: \
unreachable(); \
} \
diff --git a/src/box/update/update_map.c b/src/box/update/update_map.c
new file mode 100644
index 000000000..14838404e
--- /dev/null
+++ b/src/box/update/update_map.c
@@ -0,0 +1,441 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "update_field.h"
+#include "fiber.h"
+#include "small/region.h"
+#include "box/tuple_format.h"
+
+/**
+ * Descriptor of one updated key-value pair. Besides updated data
+ * it contains a tail with unchanged pairs, just so as not to
+ * create a separate object for them, and to be similar with array
+ * update items.
+ */
+struct update_map_item {
+ /**
+ * Updated key. Can be NULL. In such a case this item
+ * contains only an unchanged tail. A key can be nullified
+ * if it was removed from the map, or when a map is just
+	 * created and has no updates yet.
+ */
+ const char *key;
+ /** Length of @a key. */
+ uint32_t key_len;
+ /** Updated value. */
+ struct update_field field;
+ /** Link in the list of updated map keys. */
+ struct stailq_entry in_items;
+ /**
+ * Size in bytes of unchanged tail data. It goes right
+ * after @a field.data.
+ */
+ uint32_t tail_size;
+};
+
+static inline struct update_map_item *
+update_map_item_alloc(void)
+{
+ struct update_map_item *item = (struct update_map_item *)
+ region_alloc(&fiber()->gc, sizeof(*item));
+ if (item == NULL)
+ diag_set(OutOfMemory, sizeof(*item), "region_alloc", "item");
+ return item;
+}
+
+static void
+update_map_create_item(struct update_field *field, struct update_map_item *item,
+ enum update_type type, const char *key, uint32_t key_len,
+ const char *data, uint32_t data_size, uint32_t tail_size)
+{
+ assert(field->type == UPDATE_MAP);
+ item->key = key;
+ item->key_len = key_len;
+ item->field.type = type;
+ item->field.data = data;
+	item->field.size = data_size;
+ item->tail_size = tail_size;
+ /*
+	 * Each time a new item is created, it is stored at the
+	 * head of the update map item list. It helps in case the
+	 * tuple is updated regularly, because on all subsequent
+	 * updates this key will be found at the very beginning
+	 * of the map.
+ */
+ stailq_add_entry(&field->map.items, item, in_items);
+}
+
+static inline struct update_map_item *
+update_map_new_item(struct update_field *field, enum update_type type,
+ const char *key, uint32_t key_len, const char *data,
+ uint32_t data_size, uint32_t tail_size)
+{
+ struct update_map_item *item = update_map_item_alloc();
+ if (item != NULL) {
+ update_map_create_item(field, item, type, key, key_len, data,
+ data_size, tail_size);
+ }
+ return item;
+}
+
+/**
+ * Find an update item to which @a op should be applied. The
+ * target field may not exist, but at least its parent should.
+ */
+static int
+update_map_extract_opt_item(struct update_field *field, struct update_op *op,
+ struct update_map_item **res)
+{
+ assert(field->type == UPDATE_MAP);
+ if (op->token_type == JSON_TOKEN_END) {
+ if (update_op_consume_token(op) != 0)
+ return -1;
+ if (op->token_type != JSON_TOKEN_STR) {
+ return update_err(op, "can't update a map by not a "\
+ "string key");
+ }
+ }
+ struct stailq *items = &field->map.items;
+ struct update_map_item *i, *new_item;
+ /*
+ * Firstly, try to find the key among already updated
+ * ones. Perhaps, the needed node is just an intermediate
+ * key of a bigger JSON path, and there are many updates
+ * passing this key, so it should be here for all except
+ * first updates.
+ */
+ stailq_foreach_entry(i, items, in_items) {
+ if (i->key != NULL && i->key_len == op->key_len &&
+ memcmp(i->key, op->key, i->key_len) == 0) {
+ *res = i;
+ return 0;
+ }
+ }
+ /*
+ * Slow path - the key is updated first time, need to
+ * decode tails.
+ */
+ uint32_t key_len, i_tail_size;
+ const char *pos, *key, *end, *tmp, *begin;
+ stailq_foreach_entry(i, items, in_items) {
+ begin = i->field.data + i->field.size;
+ pos = begin;
+ end = begin + i->tail_size;
+ i_tail_size = 0;
+		while (pos < end) {
+ if (mp_typeof(*pos) != MP_STR) {
+ mp_next(&pos);
+ mp_next(&pos);
+ continue;
+ }
+ i_tail_size = pos - begin;
+ key = mp_decode_str(&pos, &key_len);
+ if (key_len == op->key_len &&
+ memcmp(key, op->key, key_len) == 0)
+ goto key_is_found;
+ mp_next(&pos);
+ }
+ }
+ *res = NULL;
+ return 0;
+
+key_is_found:
+ tmp = pos;
+ mp_next(&tmp);
+ if (i_tail_size == 0 && i->key == NULL) {
+ /*
+		 * Looks like the needed key was found at the
+		 * beginning of the tail of an item having no key.
+		 * This is actually good, because this item can be
+		 * transformed instead of creating a new item.
+ */
+ i->key = op->key;
+ i->key_len = op->key_len;
+ i->field.data = pos;
+ i->field.size = tmp - pos;
+ i->tail_size = end - tmp;
+ new_item = i;
+ } else {
+ new_item = update_map_new_item(field, UPDATE_NOP, op->key,
+ op->key_len, pos, tmp - pos,
+ end - tmp);
+ if (new_item == NULL)
+ return -1;
+ i->tail_size = i_tail_size;
+ }
+ *res = new_item;
+ return 0;
+}
+
+/**
+ * The same as optional extractor, but the field to update should
+ * exist. It is the case of all scalar operations (except '=' - it
+ * can work as insert).
+ */
+static inline struct update_map_item *
+update_map_extract_item(struct update_field *field, struct update_op *op)
+{
+ assert(field->type == UPDATE_MAP);
+ struct update_map_item *res;
+ if (update_map_extract_opt_item(field, op, &res) != 0)
+ return NULL;
+ if (res == NULL)
+ update_err_no_such_field(op);
+ return res;
+}
+
+int
+do_op_map_insert(struct update_op *op, struct update_field *field)
+{
+ assert(field->type == UPDATE_MAP);
+ struct update_map_item *item;
+ if (update_map_extract_opt_item(field, op, &item) != 0)
+ return -1;
+ if (! update_op_is_term(op)) {
+ if (item == NULL)
+ return update_err_no_such_field(op);
+ op->token_type = JSON_TOKEN_END;
+ return do_op_insert(op, &item->field);
+ }
+ if (item != NULL)
+ return update_err_duplicate(op);
+ ++field->map.size;
+ item = update_map_new_item(field, UPDATE_NOP, op->key, op->key_len,
+ op->arg.set.value, op->arg.set.length, 0);
+ return item != NULL ? 0 : -1;
+}
+
+int
+do_op_map_set(struct update_op *op, struct update_field *field)
+{
+ assert(field->type == UPDATE_MAP);
+ struct update_map_item *item;
+ if (update_map_extract_opt_item(field, op, &item) != 0)
+ return -1;
+ if (! update_op_is_term(op)) {
+ if (item == NULL)
+ return update_err_no_such_field(op);
+ op->token_type = JSON_TOKEN_END;
+ return do_op_set(op, &item->field);
+ }
+ if (item != NULL) {
+ op->new_field_len = op->arg.set.length;
+ /* Ignore the previous op, if any. */
+ item->field.type = UPDATE_SCALAR;
+ item->field.scalar.op = op;
+ return 0;
+ }
+ ++field->map.size;
+ item = update_map_new_item(field, UPDATE_NOP, op->key, op->key_len,
+ op->arg.set.value, op->arg.set.length, 0);
+ return item != NULL ? 0 : -1;
+}
+
+int
+do_op_map_delete(struct update_op *op, struct update_field *field)
+{
+ assert(field->type == UPDATE_MAP);
+ struct update_map_item *item;
+ if (update_map_extract_opt_item(field, op, &item) != 0)
+ return -1;
+ if (! update_op_is_term(op)) {
+ if (item == NULL)
+ return update_err_no_such_field(op);
+ op->token_type = JSON_TOKEN_END;
+ return do_op_delete(op, &item->field);
+ }
+ if (op->arg.del.count != 1)
+ return update_err_delete1(op);
+ if (item == NULL)
+ return 0;
+ /*
+	 * Note, even if the tail size becomes 0, this item is
+	 * not deleted. This is because the items are linked via
+	 * stailq, whose elements can't be deleted as simply as
+	 * that. But it is not a big deal, because '#' is a
+	 * really rare operation.
+	 * Why can't the next key simply be decoded from the
+	 * tail? Why should the key be NULL here? This is
+	 * because JSON updates allow updating a map containing
+	 * non-string keys. If the next key is not a string, it
+	 * can't be saved as the key of the item. Therefore, it
+	 * is better not to touch unchanged tails unless a new
+	 * update operation needs them.
+ */
+ item->key = NULL;
+ item->key_len = 0;
+ item->field.data += item->field.size;
+ item->field.size = 0;
+ item->field.type = UPDATE_NOP;
+ --field->map.size;
+ return 0;
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type) \
+int \
+do_op_map_##op_type(struct update_op *op, struct update_field *field) \
+{ \
+ assert(field->type == UPDATE_MAP); \
+ struct update_map_item *item = \
+ update_map_extract_item(field, op); \
+ if (item == NULL) \
+ return -1; \
+ if (! update_op_is_term(op)) { \
+ op->token_type = JSON_TOKEN_END; \
+ return do_op_##op_type(op, &item->field); \
+ } \
+ if (item->field.type != UPDATE_NOP) \
+ return update_err_double(op); \
+ if (update_op_do_##op_type(op, item->field.data) != 0) \
+ return -1; \
+ item->field.type = UPDATE_SCALAR; \
+ item->field.scalar.op = op; \
+ return 0; \
+}
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+int
+update_map_create(struct update_field *field, const char *header,
+ const char *data, const char *data_end, int field_count)
+{
+ field->type = UPDATE_MAP;
+ field->data = header;
+ field->size = data_end - header;
+ field->map.size = field_count;
+ stailq_create(&field->map.items);
+ if (field_count == 0)
+ return 0;
+ struct update_map_item *first =
+ update_map_new_item(field, UPDATE_NOP, NULL, 0, data, 0,
+ data_end - data);
+ return first != NULL ? 0 : -1;
+}
+
+int
+update_map_create_with_child(struct update_field *field,
+ const struct update_field *child, const char *key,
+ uint32_t key_len, const char *header)
+{
+ field->type = UPDATE_MAP;
+ field->data = header;
+ stailq_create(&field->map.items);
+
+ const char *pos = header;
+ uint32_t i, ikey_len, field_count = mp_decode_map(&pos);
+ const char *begin = pos;
+ struct update_map_item *item = NULL;
+ for (i = 0; i < field_count; ++i) {
+ if (mp_typeof(*pos) != MP_STR) {
+ mp_next(&pos);
+ mp_next(&pos);
+ continue;
+ }
+ const char *before_key = pos;
+ const char *ikey = mp_decode_str(&pos, &ikey_len);
+ if (ikey_len == key_len && memcmp(ikey, key, key_len) == 0) {
+ item = update_map_new_item(field, UPDATE_NOP, NULL, 0,
+ begin, 0,
+ before_key - begin);
+ if (item == NULL)
+ return -1;
+ ++i;
+ break;
+ }
+ mp_next(&pos);
+ }
+ /*
+ * When a map is initialized with an existing child, it
+	 * means that it was already found earlier. I.e. it is
+ * impossible to miss it.
+ */
+ assert(item != NULL);
+ const char *data = pos;
+ mp_next(&pos);
+ uint32_t data_size = pos - data;
+ for (; i < field_count; ++i) {
+ mp_next(&pos);
+ mp_next(&pos);
+ }
+ item = update_map_item_alloc();
+ if (item == NULL)
+ return -1;
+ item->field = *child;
+ update_map_create_item(field, item, child->type, key, key_len, data,
+ data_size, pos - data - data_size);
+ field->map.size = field_count;
+ field->size = pos - header;
+ return 0;
+}
+
+uint32_t
+update_map_sizeof(struct update_field *field)
+{
+ assert(field->type == UPDATE_MAP);
+ uint32_t res = mp_sizeof_map(field->map.size);
+ struct update_map_item *i;
+ stailq_foreach_entry(i, &field->map.items, in_items) {
+ res += i->tail_size;
+ if (i->key != NULL) {
+ res += mp_sizeof_str(i->key_len) +
+ update_field_sizeof(&i->field);
+ }
+ }
+ return res;
+}
+
+uint32_t
+update_map_store(struct update_field *field, char *out, char *out_end)
+{
+ assert(field->type == UPDATE_MAP);
+ char *out_begin = out;
+ out = mp_encode_map(out, field->map.size);
+ struct update_map_item *i;
+ /*
+	 * This is the trick of saving updated keys before the
+	 * others. The first loop doesn't save unchanged tails.
+ */
+ stailq_foreach_entry(i, &field->map.items, in_items) {
+ if (i->key != NULL) {
+ out = mp_encode_str(out, i->key, i->key_len);
+ out += update_field_store(&i->field, out, out_end);
+ }
+ }
+ stailq_foreach_entry(i, &field->map.items, in_items) {
+ memcpy(out, i->field.data + i->field.size, i->tail_size);
+ out += i->tail_size;
+ }
+ assert(out <= out_end);
+ return out - out_begin;
+}
diff --git a/src/box/update/update_route.c b/src/box/update/update_route.c
index f1d2ced65..41fcf8b3b 100644
--- a/src/box/update/update_route.c
+++ b/src/box/update/update_route.c
@@ -109,6 +109,36 @@ update_route_branch_array(struct update_field *next_hop,
return op->meta->do_cb(op, next_hop);
}
+/**
+ * Do the actual branch, but by a map and a key in that map. Works
+ * exactly the same as the array-counterpart.
+ */
+static int
+update_route_branch_map(struct update_field *next_hop,
+ const struct update_field *child, const char *key,
+ int key_len, const char *parent)
+{
+ struct update_op *op = child->bar.op;
+ if (child->type != UPDATE_BAR || child->bar.path_len > 0 ||
+ (op->opcode != '!' && op->opcode != '#')) {
+ return update_map_create_with_child(next_hop, child, key,
+ key_len, parent);
+ }
+ op->token_type = JSON_TOKEN_STR;
+ op->key = key;
+ op->key_len = key_len;
+ const char *data = parent;
+ uint32_t field_count = mp_decode_map(&data);
+ const char *end = data;
+ for (uint32_t i = 0; i < field_count; ++i) {
+ mp_next(&end);
+ mp_next(&end);
+ }
+ if (update_map_create(next_hop, parent, data, end, field_count) != 0)
+ return -1;
+ return op->meta->do_cb(op, next_hop);
+}
+
struct update_field *
update_route_branch(struct update_field *field, struct update_op *new_op)
{
@@ -205,19 +235,6 @@ update_route_branch(struct update_field *field, struct update_op *new_op)
} else {
next_hop = field;
}
- if (type == MP_MAP) {
- diag_set(ClientError, ER_UNSUPPORTED, "update",
- "path intersection on map");
- return NULL;
- }
- if (new_token.type != JSON_TOKEN_NUM) {
- update_err(new_op, "can not update array by non-integer index");
- return NULL;
- }
- if (type != MP_ARRAY) {
- update_err_no_such_field(new_op);
- return NULL;
- }
int path_offset = old_path_lexer.offset;
struct update_field child = *field;
@@ -245,11 +262,34 @@ update_route_branch(struct update_field *field, struct update_op *new_op)
*/
}
- new_op->token_type = JSON_TOKEN_NUM;
- new_op->field_no = new_token.num;
- if (update_route_branch_array(next_hop, &child, old_token.num,
- parent) != 0)
+ if (type == MP_ARRAY) {
+ if (new_token.type != JSON_TOKEN_NUM) {
+ update_err(new_op, "can not update array by "\
+ "non-integer index");
+ return NULL;
+ }
+ new_op->token_type = JSON_TOKEN_NUM;
+ new_op->field_no = new_token.num;
+ if (update_route_branch_array(next_hop, &child, old_token.num,
+ parent) != 0)
+ return NULL;
+ } else if (type == MP_MAP) {
+ if (new_token.type != JSON_TOKEN_STR) {
+ update_err(new_op, "can not update map by "\
+ "non-string key");
+ return NULL;
+ }
+ new_op->token_type = JSON_TOKEN_STR;
+ new_op->key = new_token.str;
+ new_op->key_len = new_token.len;
+ if (update_route_branch_map(next_hop, &child, old_token.str,
+ old_token.len, parent) != 0)
+ return NULL;
+ } else {
+ update_err_no_such_field(new_op);
return NULL;
+ }
+
if (! transform_root) {
field->type = UPDATE_ROUTE;
field->route.path = old_path;
diff --git a/test/box/update.result b/test/box/update.result
index 2c5978d8a..4cbee48e6 100644
--- a/test/box/update.result
+++ b/test/box/update.result
@@ -1471,6 +1471,158 @@ t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
---
- error: Field ''[4][3][2]'' was not found in the tuple
...
+--
+-- Intersecting map updates.
+--
+t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [17, 18,
+ 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 2300, 'e': 2400}}, 25]]
+...
+-- Fill an empty map.
+t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
+---
+- [1, {'k': 100, 'v': 200}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14],
+ 15, 16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+t = {} \
+t[1] = 1 \
+t[2] = { \
+ a = 1, \
+ b = 2, \
+ c = {3, 4, 5}, \
+ d = { \
+ [0] = 0, \
+ e = 6, \
+ [true] = false, \
+ f = 7, \
+ [-1] = -1, \
+ }, \
+ g = { \
+ {k = 8, v = 9}, \
+ [box.NULL] = box.NULL, \
+ {k = 10, v = 11}, \
+ {k = '12str', v = '13str'}, \
+ }, \
+ h = { \
+ [-1] = {{{{-1}, -1}, -1}, -1}, \
+ i = { \
+ j = 14, \
+ k = 15, \
+ }, \
+ m = 16, \
+ } \
+}
+---
+...
+t[3] = {}
+---
+...
+t = box.tuple.new(t)
+---
+...
+-- Insert and delete from the same map at once.
+t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
+---
+- [1, {'a': 1, 'n': 17, 'd': {0: 0, -1: -1, true: false, 'f': 7, 'e': 6}, 'h': {'i': {
+ 'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3, 4, 5], 'g': {
+ 1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+ null: null}}, []]
+...
+-- Delete a non-existing key.
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
+---
+- [1, {'b': 2, 'a': 1, 'd': {0: 0, 'f': 7, 'e': 6, -1: -1, true: false, 'g': 6000},
+ 'h': {'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3,
+ 4, 5], 'g': {1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+ null: null}}, []]
+...
+-- Scalar operations.
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
+---
+- [1, {'b': 3, 'c': [3, 4, 5], 'd': {0: 0, true: false, 'f': 8, -1: -1, 'e': 7}, 'h': {
+ 'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'a': 1, 'g': {
+ 1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+ null: null}}, []]
+...
+-- Errors.
+-- This operation array creates an update tree with several
+-- full-featured maps, not bars. It is required to cover some
+-- complex cases of field extraction.
+ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
+---
+...
+-- When a map update extracts the next token on demand, it should
+-- reject bad JSON, obviously. Scalar and '!'/'#' operations are
+-- implemented separately, and both are tested.
+ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+ops[#ops][1] = '!'
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+ops[#ops][1] = '#'
+---
+...
+ops[#ops][3] = 1
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+-- Key extractor should reject any attempt to use non-string keys.
+ops[#ops] = {'-', '[2].h.i[20]', -1}
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
+ key'
+...
+ops[#ops][1] = '='
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
+ key'
+...
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
+---
+- error: Field ''[2].d.k'' was not found in the tuple
+...
+t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
+---
+- error: Field ''[2].d.g.h'' was not found in the tuple
+...
+t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
+---
+- error: 'Field ''[2].d.e'' UPDATE error: the key exists already'
+...
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
+---
+- error: 'Field ''[2].d.old'' UPDATE error: can delete only 1 field from a map in
+ a row'
+...
+-- '!'/'=' can be used to create a field, but only if it is a
+-- tail. These operations can't create the whole path.
+t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
+---
+- error: Field ''[2].d.new.new'' was not found in the tuple
+...
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
+---
+- error: Field ''[2].d.old.old'' was not found in the tuple
+...
s:drop()
---
...
diff --git a/test/box/update.test.lua b/test/box/update.test.lua
index d91a2e95c..6fed818f4 100644
--- a/test/box/update.test.lua
+++ b/test/box/update.test.lua
@@ -515,4 +515,79 @@ t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]',
-- during branching.
t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
+--
+-- Intersecting map updates.
+--
+t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
+-- Fill an empty map.
+t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
+
+t = {} \
+t[1] = 1 \
+t[2] = { \
+ a = 1, \
+ b = 2, \
+ c = {3, 4, 5}, \
+ d = { \
+ [0] = 0, \
+ e = 6, \
+ [true] = false, \
+ f = 7, \
+ [-1] = -1, \
+ }, \
+ g = { \
+ {k = 8, v = 9}, \
+ [box.NULL] = box.NULL, \
+ {k = 10, v = 11}, \
+ {k = '12str', v = '13str'}, \
+ }, \
+ h = { \
+ [-1] = {{{{-1}, -1}, -1}, -1}, \
+ i = { \
+ j = 14, \
+ k = 15, \
+ }, \
+ m = 16, \
+ } \
+}
+t[3] = {}
+t = box.tuple.new(t)
+-- Insert and delete from the same map at once.
+t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
+-- Delete a not existing key.
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
+
+-- Scalar operations.
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
+
+-- Errors.
+-- This operation array creates an update tree with several
+-- full-featured maps, not bars. It is required to cover some
+-- complex cases of field extraction.
+ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
+-- When a map update extracts the next token on demand, it should
+-- reject bad JSON, obviously. Scalar and '!'/'#' operations are
+-- implemented separately, and both are tested.
+ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
+t:update(ops)
+ops[#ops][1] = '!'
+t:update(ops)
+ops[#ops][1] = '#'
+ops[#ops][3] = 1
+t:update(ops)
+-- Key extractor should reject any attempt to use non-string keys.
+ops[#ops] = {'-', '[2].h.i[20]', -1}
+t:update(ops)
+ops[#ops][1] = '='
+t:update(ops)
+
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
+t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
+t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
+-- '!'/'=' can be used to create a field, but only if it is a
+-- tail. These operations can't create the whole path.
+t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
+
s:drop()
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 11+ messages in thread