Tarantool development patches archive
* [Tarantool-patches] [PATCH 0/3] JSON update
@ 2019-12-23 22:41 Vladislav Shpilevoy
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable Vladislav Shpilevoy
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-23 22:41 UTC (permalink / raw)
  To: tarantool-patches

The patchset completes the JSON update feature by allowing
multiple update operations with the same prefix in one update().

The patchset consists of relatively independent, self-explanatory
parts. The first part is a preparation for having not just a flat
set of updates, but an update tree with number and string keys.

The second part makes it possible to do multiple update operations
having the same prefix, if the paths diverge at an array index.

The final part introduces map updates, adds a documentation
request, and finishes the feature.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-1261-json-update
Issue: https://github.com/tarantool/tarantool/issues/1261

Vladislav Shpilevoy (3):
  tuple: make update operation tokens consumable
  tuple: JSON path update intersection at arrays
  tuple: JSON path update intersection at maps

 src/box/CMakeLists.txt      |   2 +
 src/box/xrow_update_array.c |  97 +++++++-
 src/box/xrow_update_bar.c   |  14 +-
 src/box/xrow_update_field.c |  30 +++
 src/box/xrow_update_field.h | 209 ++++++++++++++++-
 src/box/xrow_update_map.c   | 453 ++++++++++++++++++++++++++++++++++++
 src/box/xrow_update_route.c | 387 ++++++++++++++++++++++++++++++
 test/box/update.result      | 329 ++++++++++++++++++++++++++
 test/box/update.test.lua    | 166 +++++++++++++
 9 files changed, 1672 insertions(+), 15 deletions(-)
 create mode 100644 src/box/xrow_update_map.c
 create mode 100644 src/box/xrow_update_route.c

-- 
2.21.0 (Apple Git-122.2)


* [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable
  2019-12-23 22:41 [Tarantool-patches] [PATCH 0/3] JSON update Vladislav Shpilevoy
@ 2019-12-23 22:41 ` Vladislav Shpilevoy
  2019-12-24 16:15   ` Vladislav Shpilevoy
  2019-12-26 12:07   ` Sergey Ostanevich
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays Vladislav Shpilevoy
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-23 22:41 UTC (permalink / raw)
  To: tarantool-patches

Consider the case [1][2][3][4] = 100. It is not a problem when it
is a single operation, not intersecting with anything: it is an
isolated update then, and works fine. But the next patch allows
several update operations to have the same prefix, and the path
[1][2][3][4] can become a tree of updated arrays. For example, a
trivial tree like this:

    root: [ [1] ]
             |
             [ [1] [2] ]
                    |
                    [ [1] [2] [3] ]
                               |
                               [ [1] [2] [3] [4] ]
                                             =100

When the update is applied to the root, the JSON path [1][2][3][4]
is decoded one part at a time, and the operation goes down the
tree until it reaches the leaf, where [4] = 100 is applied. Each
time the update goes one level down, something must update
xrow_update_op.field_no so that on the first level it is 1, then
2, 3, 4.

Does it mean that each level of the update [1][2][3][4] should
prepare field_no for the next child? No, because then each level
would need to check the type of the child - whether it is an
array, a map, or whatever else expects a valid field_no/key in
xrow_update_op - and ensure that a map child gets a key and an
array child gets a field_no. That would complicate the code to a
totally unreadable state and would break the encapsulation between
xrow_update_array/map/bar... . Each array update operation would
have to check a child against all existing types to ensure that
the next token matches it. The same would happen to map updates.

This patch goes another way: each level of the update checks
whether its field_no/key is already prepared by the caller, and if
not, extracts the next token from the operation path itself. So a
map update ensures that it has a valid key, and an array update
ensures that it has a valid field number.

Part of #1261
---
 src/box/xrow_update_array.c | 47 +++++++++++++++++++++++++++++++++++--
 src/box/xrow_update_field.c | 22 +++++++++++++++++
 src/box/xrow_update_field.h | 25 ++++++++++++++++++--
 3 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
index 57427e39c..6a7ce09ff 100644
--- a/src/box/xrow_update_array.c
+++ b/src/box/xrow_update_array.c
@@ -32,6 +32,31 @@
 #include "msgpuck.h"
 #include "fiber.h"
 
+/**
+ * Make sure @a op contains a valid field number to which the
+ * operation should be applied next. The field number may not be
+ * known if the array's parent didn't extract it from the
+ * operation's lexer. In fact, the parent fills field_no only in
+ * some rare cases like branching. Generally, an array should
+ * take care of field_no by itself.
+ */
+static inline int
+xrow_update_op_prepare_num_token(struct xrow_update_op *op)
+{
+	/*
+	 * Token type END is a special value meaning that the
+	 * current token needs to be parsed.
+	 */
+	if (op->token_type == JSON_TOKEN_END &&
+	    xrow_update_op_consume_token(op) != 0)
+		return -1;
+	if (op->token_type != JSON_TOKEN_NUM) {
+		return xrow_update_err(op, "can't update an array by not a "\
+				       "number index");
+	}
+	return 0;
+}
+
 /**
  * Make field index non-negative and check for the field
  * existence.
@@ -39,6 +64,7 @@
 static inline int
 xrow_update_op_adjust_field_no(struct xrow_update_op *op, int32_t field_count)
 {
+	assert(op->token_type == JSON_TOKEN_NUM);
 	if (op->field_no >= 0) {
 		if (op->field_no < field_count)
 			return 0;
@@ -221,10 +247,14 @@ xrow_update_op_do_array_insert(struct xrow_update_op *op,
 {
 	assert(field->type == XUPDATE_ARRAY);
 	struct xrow_update_array_item *item;
+	if (xrow_update_op_prepare_num_token(op) != 0)
+		return -1;
+
 	if (!xrow_update_op_is_term(op)) {
 		item = xrow_update_array_extract_item(field, op);
 		if (item == NULL)
 			return -1;
+		op->token_type = JSON_TOKEN_END;
 		return xrow_update_op_do_field_insert(op, &item->field);
 	}
 
@@ -248,6 +278,9 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
 {
 	assert(field->type == XUPDATE_ARRAY);
 	struct xrow_update_rope *rope = field->array.rope;
+	if (xrow_update_op_prepare_num_token(op) != 0)
+		return -1;
+
 	/* Interpret '=' for n + 1 field as insert. */
 	if (op->field_no == (int32_t) xrow_update_rope_size(rope))
 		return xrow_update_op_do_array_insert(op, field);
@@ -256,8 +289,10 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
 		xrow_update_array_extract_item(field, op);
 	if (item == NULL)
 		return -1;
-	if (!xrow_update_op_is_term(op))
+	if (!xrow_update_op_is_term(op)) {
+		op->token_type = JSON_TOKEN_END;
 		return xrow_update_op_do_field_set(op, &item->field);
+	}
 	op->new_field_len = op->arg.set.length;
 	/*
 	 * Ignore the previous op, if any. It is not correct,
@@ -275,11 +310,15 @@ xrow_update_op_do_array_delete(struct xrow_update_op *op,
 			       struct xrow_update_field *field)
 {
 	assert(field->type == XUPDATE_ARRAY);
+	if (xrow_update_op_prepare_num_token(op) != 0)
+		return -1;
+
 	if (!xrow_update_op_is_term(op)) {
 		struct xrow_update_array_item *item =
 			xrow_update_array_extract_item(field, op);
 		if (item == NULL)
 			return -1;
+		op->token_type = JSON_TOKEN_END;
 		return xrow_update_op_do_field_delete(op, &item->field);
 	}
 
@@ -301,12 +340,16 @@ int										\
 xrow_update_op_do_array_##op_type(struct xrow_update_op *op,			\
 				  struct xrow_update_field *field)		\
 {										\
+	if (xrow_update_op_prepare_num_token(op) != 0)				\
+		return -1;							\
 	struct xrow_update_array_item *item =					\
 		xrow_update_array_extract_item(field, op);			\
 	if (item == NULL)							\
 		return -1;							\
-	if (!xrow_update_op_is_term(op))					\
+	if (!xrow_update_op_is_term(op)) {					\
+		op->token_type = JSON_TOKEN_END;				\
 		return xrow_update_op_do_field_##op_type(op, &item->field);	\
+	}									\
 	if (item->field.type != XUPDATE_NOP)					\
 		return xrow_update_err_double(op);				\
 	if (xrow_update_op_do_##op_type(op, item->field.data) != 0)		\
diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
index de865a21d..96fcaf747 100644
--- a/src/box/xrow_update_field.c
+++ b/src/box/xrow_update_field.c
@@ -611,6 +611,22 @@ xrow_update_op_by(char opcode)
 	}
 }
 
+int
+xrow_update_op_consume_token(struct xrow_update_op *op)
+{
+	struct json_token token;
+	int rc = json_lexer_next_token(&op->lexer, &token);
+	if (rc != 0)
+		return xrow_update_err_bad_json(op, rc);
+	if (token.type == JSON_TOKEN_END)
+		return xrow_update_err_no_such_field(op);
+	op->token_type = token.type;
+	op->key = token.str;
+	op->key_len = token.len;
+	op->field_no = token.num;
+	return 0;
+}
+
 int
 xrow_update_op_decode(struct xrow_update_op *op, int index_base,
 		      struct tuple_dictionary *dict, const char **expr)
@@ -639,6 +655,12 @@ xrow_update_op_decode(struct xrow_update_op *op, int index_base,
 		diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
 		return -1;
 	}
+	/*
+	 * The first token is always a number. Even if a user
+	 * specified a field name, it is converted to a number by
+	 * the tuple dictionary.
+	 */
+	op->token_type = JSON_TOKEN_NUM;
 	int32_t field_no = 0;
 	switch(mp_typeof(**expr)) {
 	case MP_INT:
diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
index bda9222cc..fbaf45c5d 100644
--- a/src/box/xrow_update_field.h
+++ b/src/box/xrow_update_field.h
@@ -179,8 +179,18 @@ struct xrow_update_op {
 	const struct xrow_update_op_meta *meta;
 	/** Operation arguments. */
 	union xrow_update_arg arg;
-	/** First level field no. */
-	int32_t field_no;
+	/**
+	 * Current level token. END means that it is invalid and
+	 * a next token should be extracted from the lexer.
+	 */
+	enum json_token_type token_type;
+	union {
+		struct {
+			const char *key;
+			uint32_t key_len;
+		};
+		int32_t field_no;
+	};
 	/** Size of a new field after it is updated. */
 	uint32_t new_field_len;
 	/** Opcode symbol: = + - / ... */
@@ -193,6 +203,17 @@ struct xrow_update_op {
 	struct json_lexer lexer;
 };
 
+/**
+ * Extract the next token from the operation path lexer. The
+ * result is used to decide to which child of the current
+ * map/array the operation should be forwarded. It is not just a
+ * synonym of json_lexer_next_token, because it fills some fields
+ * of @a op, and should be used only to choose the next child
+ * inside the current map/array.
+ */
+int
+xrow_update_op_consume_token(struct xrow_update_op *op);
+
 /**
  * Decode an update operation from MessagePack.
  * @param[out] op Update operation.
-- 
2.21.0 (Apple Git-122.2)


* [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays
  2019-12-23 22:41 [Tarantool-patches] [PATCH 0/3] JSON update Vladislav Shpilevoy
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable Vladislav Shpilevoy
@ 2019-12-23 22:41 ` Vladislav Shpilevoy
  2019-12-27 14:13   ` Sergey Ostanevich
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps Vladislav Shpilevoy
  2019-12-30  5:26 ` [Tarantool-patches] [PATCH 0/3] JSON update Kirill Yukhin
  3 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-23 22:41 UTC (permalink / raw)
  To: tarantool-patches

Before this patch only isolated JSON updates were supported, as
the simplest and fastest to implement. This patch allows update
operations with paths having the same prefix, but the difference
of the paths must start at an array index.

For example, this is allowed:

    [1][2][3].a.b.c = 20
    [1][2][4].e.f.g = 30

    First difference is [3] vs [4] - intersection by an array, ok.

This is not allowed yet:

    [1][2][3].a.b.c = 20
    [1][2][3].a.e.f = 30

    First difference is 'b' vs 'e' - intersection by a map,
    not ok.

For that a new update tree node type is added: XUPDATE_ROUTE.
When several update operations have the same prefix, this prefix
becomes an XUPDATE_ROUTE tree node. It stores the prefix and a
subtree with these operations.

Bar and route update nodes can branch and produce more bars and
routes as new operations arrive.

Part of #1261
---
 src/box/CMakeLists.txt      |   1 +
 src/box/xrow_update_array.c |  50 ++++++
 src/box/xrow_update_bar.c   |  14 +-
 src/box/xrow_update_field.c |   4 +
 src/box/xrow_update_field.h |  99 +++++++++++
 src/box/xrow_update_route.c | 346 ++++++++++++++++++++++++++++++++++++
 test/box/update.result      | 177 ++++++++++++++++++
 test/box/update.test.lua    |  91 ++++++++++
 8 files changed, 772 insertions(+), 10 deletions(-)
 create mode 100644 src/box/xrow_update_route.c

diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index fc9d1a3e8..f99efc503 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -44,6 +44,7 @@ add_library(tuple STATIC
     xrow_update_field.c
     xrow_update_array.c
     xrow_update_bar.c
+    xrow_update_route.c
     tuple_compare.cc
     tuple_extract_key.cc
     tuple_hash.cc
diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
index 6a7ce09ff..e49dad5c1 100644
--- a/src/box/xrow_update_array.c
+++ b/src/box/xrow_update_array.c
@@ -195,6 +195,56 @@ xrow_update_array_create(struct xrow_update_field *field, const char *header,
 	return xrow_update_rope_append(field->array.rope, item, field_count);
 }
 
+int
+xrow_update_array_create_with_child(struct xrow_update_field *field,
+				    const char *header,
+				    const struct xrow_update_field *child,
+				    int32_t field_no)
+{
+	const char *data = header;
+	uint32_t field_count = mp_decode_array(&data);
+	const char *first_field = data;
+	const char *first_field_end = first_field;
+	mp_next(&first_field_end);
+	struct region *region = &fiber()->gc;
+	struct xrow_update_rope *rope = xrow_update_rope_new(region);
+	if (rope == NULL)
+		return -1;
+	struct xrow_update_array_item *item = (struct xrow_update_array_item *)
+		xrow_update_alloc(region, sizeof(*item));
+	if (item == NULL)
+		return -1;
+	const char *end = first_field_end;
+	if (field_no > 0) {
+		for (int32_t i = 1; i < field_no; ++i)
+			mp_next(&end);
+		xrow_update_array_item_create(item, XUPDATE_NOP, first_field,
+					      first_field_end - first_field,
+					      end - first_field_end);
+		if (xrow_update_rope_append(rope, item, field_no) != 0)
+			return -1;
+		item = (struct xrow_update_array_item *)
+			xrow_update_alloc(region, sizeof(*item));
+		if (item == NULL)
+			return -1;
+		first_field = end;
+		first_field_end = first_field;
+		mp_next(&first_field_end);
+		end = first_field_end;
+	}
+	for (uint32_t i = field_no + 1; i < field_count; ++i)
+		mp_next(&end);
+	item->field = *child;
+	xrow_update_array_item_create(item, child->type, first_field,
+				      first_field_end - first_field,
+				      end - first_field_end);
+	field->type = XUPDATE_ARRAY;
+	field->data = header;
+	field->size = end - header;
+	field->array.rope = rope;
+	return xrow_update_rope_append(rope, item, field_count - field_no);
+}
+
 uint32_t
 xrow_update_array_sizeof(struct xrow_update_field *field)
 {
diff --git a/src/box/xrow_update_bar.c b/src/box/xrow_update_bar.c
index 7285c7e2b..1c2d2ef99 100644
--- a/src/box/xrow_update_bar.c
+++ b/src/box/xrow_update_bar.c
@@ -307,17 +307,11 @@ int										\
 xrow_update_op_do_bar_##op_type(struct xrow_update_op *op,			\
 				struct xrow_update_field *field)		\
 {										\
-	/*									\
-	 * The only way to update a bar is to make a second update		\
-	 * with the same prefix as this bar. But it is not			\
-	 * supported yet.							\
-	 */									\
-	(void) op;								\
-	(void) field;								\
 	assert(field->type == XUPDATE_BAR);					\
-	diag_set(ClientError, ER_UNSUPPORTED, "update",				\
-		 "intersected JSON paths");					\
-	return -1;								\
+	field = xrow_update_route_branch(field, op);				\
+	if (field == NULL)							\
+		return -1;							\
+	return xrow_update_op_do_field_##op_type(op, field);			\
 }
 
 DO_BAR_OP_GENERIC(insert)
diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
index 96fcaf747..3a4aa6c90 100644
--- a/src/box/xrow_update_field.c
+++ b/src/box/xrow_update_field.c
@@ -114,6 +114,8 @@ xrow_update_field_sizeof(struct xrow_update_field *field)
 		return xrow_update_array_sizeof(field);
 	case XUPDATE_BAR:
 		return xrow_update_bar_sizeof(field);
+	case XUPDATE_ROUTE:
+		return xrow_update_route_sizeof(field);
 	default:
 		unreachable();
 	}
@@ -141,6 +143,8 @@ xrow_update_field_store(struct xrow_update_field *field, char *out,
 		return xrow_update_array_store(field, out, out_end);
 	case XUPDATE_BAR:
 		return xrow_update_bar_store(field, out, out_end);
+	case XUPDATE_ROUTE:
+		return xrow_update_route_store(field, out, out_end);
 	default:
 		unreachable();
 	}
diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
index fbaf45c5d..f07b5a649 100644
--- a/src/box/xrow_update_field.h
+++ b/src/box/xrow_update_field.h
@@ -275,6 +275,23 @@ enum xrow_update_type {
 	 * path nodes. And this is the most common case.
 	 */
 	XUPDATE_BAR,
+	/**
+	 * Field with a subtree of updates having the same prefix
+	 * stored here explicitly. New updates with the same
+	 * prefix just follow it without decoding of JSON nor
+	 * MessagePack. It can be quite helpful when an update
+	 * works with the same internal object via several
+	 * operations like this:
+	 *
+	 *    [1][2].a.b.c[1] = 20
+	 *    [1][2].a.b.c[2] = 30
+	 *    [1][2].a.b.c[3] = true
+	 *
+	 * Here [1][2].a.b.c is stored only once as a route with a
+	 * child XUPDATE_ARRAY making updates '[1] = 20',
+	 * '[2] = 30', '[3] = true'.
+	 */
+	XUPDATE_ROUTE,
 };
 
 /**
@@ -355,6 +372,17 @@ struct xrow_update_field {
 				};
 			};
 		} bar;
+		/** Route update - path to an update subtree. */
+		struct {
+			/**
+			 * Common prefix of all updates in the
+			 * subtree.
+			 */
+			const char *path;
+			int path_len;
+			/** Update subtree. */
+			struct xrow_update_field *next_hop;
+		} route;
 	};
 };
 
@@ -447,6 +475,33 @@ xrow_update_array_create(struct xrow_update_field *field, const char *header,
 			 const char *data, const char *data_end,
 			 uint32_t field_count);
 
+/**
+ * The same as plain create, but the array is created with a
+ * first child right away. This is more efficient, because under
+ * the hood the rope is created not as one huge range which is
+ * then split into parts, but as two rope nodes from the
+ * beginning. In summary: -1 rope node split, -1 decoding of
+ * fields from 0 to @a field_no.
+ *
+ * The function is used during branching, where there was an
+ * existing update, but another one came with the same prefix, and
+ * a different suffix.
+ *
+ * @param[out] field Field to initialize.
+ * @param header Header of the MessagePack array.
+ * @param child A child subtree. The child is copied by value into
+ *        the created array.
+ * @param field_no Field number of @a child.
+ *
+ * @retval  0 Success.
+ * @retval -1 Error.
+ */
+int
+xrow_update_array_create_with_child(struct xrow_update_field *field,
+				    const char *header,
+				    const struct xrow_update_field *child,
+				    int32_t field_no);
+
 OP_DECL_GENERIC(array)
 
 /* }}} xrow_update_field.array */
@@ -463,6 +518,37 @@ OP_DECL_GENERIC(nop)
 
 /* }}} update_field.nop */
 
+/* {{{ xrow_update_field.route */
+
+/**
+ * Take a bar or a route @a field and split its path at the place
+ * where @a new_op should be applied. The prefix becomes a new
+ * route object, the suffix becomes a child of the resulting
+ * route. @a field stays the root of its subtree, and the node of
+ * that subtree to which @a new_op should be applied is returned.
+ *
+ * Note, this function does not apply @a new_op. It just finds
+ * where it *should be* applied and does all the preparations.
+ * This is deliberate: otherwise the do_cb() virtual function of
+ * @a new_op would have to be called here, where there is no
+ * context. But a caller always knows exactly whether it was
+ * insert, set, arith, etc., and can use a more specific function
+ * like xrow_update_op_do_field_set/insert/... .
+ *
+ * @param field Field to find where to apply @a new_op.
+ * @param new_op New operation to apply.
+ *
+ * @retval not-NULL A field to which @a new_op should be applied.
+ * @retval NULL Error.
+ */
+struct xrow_update_field *
+xrow_update_route_branch(struct xrow_update_field *field,
+			 struct xrow_update_op *new_op);
+
+OP_DECL_GENERIC(route)
+
+/* }}} xrow_update_field.route */
+
 #undef OP_DECL_GENERIC
 
 /* {{{ Common helpers. */
@@ -475,6 +561,17 @@ OP_DECL_GENERIC(nop)
  * xrow_update_op_do_field_set on its childs. Each child can be
  * another array, a bar, a route, a map - anything. The functions
  * below help to make such places shorter and simpler.
+ *
+ * Note that they are recursive, although it is not immediately
+ * visible. For example, if an update tree contains several array
+ * nodes on one tree branch, then an update of the deepest array
+ * goes through each of these nodes and calls
+ * xrow_update_op_do_array_<opname>() on the needed children. But
+ * that is ok, because the operation count is usually small (<<50
+ * in most cases, usually <= 5), and the update tree depth is not
+ * bigger than the operation count. Also, the fiber stack is big
+ * enough to fit ~10k of update tree depth - an incredible number,
+ * even though the real limit is 4k due to the operation limit.
  */
 #define OP_DECL_GENERIC(op_type)						\
 static inline int								\
@@ -488,6 +585,8 @@ xrow_update_op_do_field_##op_type(struct xrow_update_op *op,			\
 		return xrow_update_op_do_nop_##op_type(op, field);		\
 	case XUPDATE_BAR:							\
 		return xrow_update_op_do_bar_##op_type(op, field);		\
+	case XUPDATE_ROUTE:							\
+		return xrow_update_op_do_route_##op_type(op, field);		\
 	default:								\
 		unreachable();							\
 	}									\
diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
new file mode 100644
index 000000000..f374db837
--- /dev/null
+++ b/src/box/xrow_update_route.c
@@ -0,0 +1,346 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "xrow_update_field.h"
+#include "fiber.h"
+#include "tuple.h"
+
+/**
+ * Do the actual branch. This is the case when an existing
+ * bar/route path is different from a new operation's path in an
+ * array. The existing object needs to be split into parent-child,
+ * and the new operation becomes a second child.
+ *
+ * @param[out] next_hop A field which will be initialized as an
+ *        array, and which will be the point where the new
+ *        operation is applied.
+ * @param parent MessagePack array to be taken by @a next_hop.
+ * @param child Current field from which the branch happens. It
+ *        already contains an update subtree.
+ */
+static int
+xrow_update_route_branch_array(struct xrow_update_field *next_hop,
+			       const char *parent,
+			       const struct xrow_update_field *child,
+			       int32_t field_no)
+{
+	struct xrow_update_op *op = child->bar.op;
+	/*
+	 * There are rules when a subtree can be just copied as is
+	 * from one parent to another.
+	 * 1) It should not be a bar update. Because if it is not
+	 *    a bar, then it is either scalar or an array/map.
+	 *    Scalar update can be safely moved. Array/map update
+	 *    doesn't change their parent, and also can be moved.
+	 *    Otherwise see (2).
+	 * 2) It is a bar. Then it should not be a leaf. If it is
+	 *    not a leaf, then it does not change header and other
+	 *    fields of this particular array, and can be safely
+	 *    moved to somewhere else.
+	 *    Otherwise see (1).
+	 * 3) Ok, it is a bar and a leaf. Then it is a bar with
+	 *    zero path length - it could degrade to zero path len
+	 *    during branching. In this case it should be a scalar
+	 *    bar. The only non-scalar operations are '!' and '#'.
+	 *
+	 * Why '#' and '!' can't be moved? '!', for example, being
+	 * applied to a field [1], affects all fields [2-*], and
+	 * the array header. The same but in an even worse form
+	 * about '#'. Such operations should be redone. They
+	 * affect many fields and the parent.
+	 *
+	 * There is a tricky thing though - why not just redo all
+	 * operations here, for the sake of code simplicity? It
+	 * would allow removing the 'create_with_child' crutch.
+	 * The answer is - it is not possible. If a field is
+	 * copyable, it is not re-applicable, and vice versa. For
+	 * example, if it is not a leaf, then there may be many
+	 * operations, not one. A subtree just can't be
+	 * 're-applied'.
+	 *
+	 * If the operation is scalar and a leaf, then its result
+	 * has already overridden its arguments. This is because
+	 * scalar operations save the result into the arguments to
+	 * save memory. Applying the operation a second time would
+	 * lead to very surprising results.
+	 *
+	 * Another reason is performance. This path should be
+	 * quite hot, and copying a struct is surely much faster
+	 * than reapplying an operation via a virtual function.
+	 * Operations '!' and '#' are quite rare, so their
+	 * optimization is not a critical goal.
+	 */
+	if (/* (1) Not a bar. */
+	    child->type != XUPDATE_BAR ||
+	    /* (2) Bar, but not a leaf. */
+	    child->bar.path_len > 0 ||
+	    /* (3) Leaf, bar, but a scalar operation. */
+	    (op->opcode != '!' && op->opcode != '#')) {
+		return xrow_update_array_create_with_child(next_hop, parent,
+							   child, field_no);
+	}
+	/*
+	 * Can't move the child. The only way to branch is to
+	 * reapply the child's operation.
+	 */
+	op->token_type = JSON_TOKEN_NUM;
+	op->field_no = field_no;
+	const char *data = parent;
+	uint32_t field_count = mp_decode_array(&data);
+	const char *end = data;
+	for (uint32_t i = 0; i < field_count; ++i)
+		mp_next(&end);
+	if (xrow_update_array_create(next_hop, parent, data, end,
+				     field_count) != 0)
+		return -1;
+	return op->meta->do_op(op, next_hop);
+}
+
+struct xrow_update_field *
+xrow_update_route_branch(struct xrow_update_field *field,
+			 struct xrow_update_op *new_op)
+{
+	assert(new_op->lexer.src != NULL);
+	const char *old_path;
+	int old_path_len;
+	if (field->type == XUPDATE_BAR) {
+		old_path = field->bar.path;
+		old_path_len = field->bar.path_len;
+	} else {
+		assert(field->type == XUPDATE_ROUTE);
+		old_path = field->route.path;
+		old_path_len = field->route.path_len;
+	}
+	assert(old_path != NULL);
+	struct json_lexer old_path_lexer;
+	struct json_token old_token, new_token;
+	/*
+	 * Offset is going to be used as a length of the route
+	 * node created as a parent of the old subtree and the
+	 * new operation. As it is described in the route
+	 * definition in struct xrow_update_field - route is the
+	 * common prefix of all operations of the subtree. Here
+	 * length of that prefix is calculated.
+	 * In addition, it is used here to determine if the new
+	 * operation is different from the current subtree from
+	 * the very beginning. Basically, it means the offset is
+	 * 0, and no route is generated. Root becomes a regular
+	 * update field (array, map), not a route. Example:
+	 * @a field is a bar '[1].a.b = 20', @a new_op is
+	 * '[2].c.d = 30'. In this case the paths are already
+	 * different from the beginning, no common prefix = no
+	 * route. Array with children [1].a.b and [2].c.d becomes
+	 * a root.
+	 */
+	int saved_old_offset;
+	json_lexer_create(&old_path_lexer, old_path, old_path_len,
+			  TUPLE_INDEX_BASE);
+	const char *parent = field->data;
+	do {
+		saved_old_offset = old_path_lexer.offset;
+		int rc = json_lexer_next_token(&old_path_lexer, &old_token);
+		/* Old path is already validated. */
+		assert(rc == 0);
+		rc = json_lexer_next_token(&new_op->lexer, &new_token);
+		if (rc != 0) {
+			xrow_update_err_bad_json(new_op, rc);
+			return NULL;
+		}
+		if (json_token_cmp(&old_token, &new_token) != 0)
+			break;
+		switch(new_token.type) {
+		case JSON_TOKEN_NUM:
+			rc = tuple_field_go_to_index(&parent, new_token.num);
+			break;
+		case JSON_TOKEN_STR:
+			rc = tuple_field_go_to_key(&parent, new_token.str,
+						   new_token.len);
+			break;
+		default:
+			/*
+			 * Can't be JSON_TOKEN_ANY, because old
+			 * and new tokens are equal, but '*' is
+			 * considered invalid and the old was
+			 * already checked for that. So the new is
+			 * valid too. And can't have type ANY.
+			 */
+			assert(new_token.type == JSON_TOKEN_END);
+			xrow_update_err_double(new_op);
+			return NULL;
+		}
+		/*
+		 * Must always find a field, because the old
+		 * token already went that path.
+		 */
+		assert(rc == 0);
+	} while (true);
+	enum mp_type type = mp_typeof(*parent);
+	/*
+	 * Check if the paths are different from the very
+	 * beginning. If so, the old field should be transformed
+	 * instead of creating a new route node.
+	 */
+	bool transform_root = saved_old_offset == 0;
+	struct xrow_update_field *next_hop;
+	if (! transform_root) {
+		next_hop = (struct xrow_update_field *)
+			region_alloc(&fiber()->gc, sizeof(*next_hop));
+		if (next_hop == NULL) {
+			diag_set(OutOfMemory, sizeof(*next_hop), "region_alloc",
+				 "next_hop");
+			return NULL;
+		}
+	} else {
+		next_hop = field;
+	}
+
+	int path_offset = old_path_lexer.offset;
+	struct xrow_update_field child = *field;
+	if (child.type == XUPDATE_ROUTE) {
+		child.route.path += path_offset;
+		child.route.path_len -= path_offset;
+		if (child.route.path_len == 0)
+			child = *child.route.next_hop;
+	} else {
+		assert(child.type == XUPDATE_BAR);
+		child.bar.path += path_offset;
+		child.bar.path_len -= path_offset;
+		/*
+		 * Yeah, bar length can become 0 here, but it is
+		 * ok as long as it is a scalar operation (not '!'
+		 * and not '#'). When a bar is scalar, it operates
+		 * on one concrete field and works even if its
+		 * path len is 0. Talking of '#' and '!' - they
+		 * are handled by array and map 'branchers'
+		 * internally, below. They reapply such
+		 * operations.
+		 */
+	}
+
+	if (type == MP_ARRAY) {
+		if (new_token.type != JSON_TOKEN_NUM) {
+			xrow_update_err(new_op, "can not update array by "\
+					"non-integer index");
+			return NULL;
+		}
+		new_op->token_type = JSON_TOKEN_NUM;
+		new_op->field_no = new_token.num;
+		if (xrow_update_route_branch_array(next_hop, parent, &child,
+						   old_token.num) != 0)
+			return NULL;
+	} else if (type == MP_MAP) {
+		diag_set(ClientError, ER_UNSUPPORTED, "update",
+			 "path intersection on map");
+		return NULL;
+	} else {
+		xrow_update_err_no_such_field(new_op);
+		return NULL;
+	}
+
+	if (! transform_root) {
+		field->type = XUPDATE_ROUTE;
+		field->route.path = old_path;
+		field->route.path_len = saved_old_offset;
+		field->route.next_hop = next_hop;
+	}
+	return next_hop;
+}
+
+/**
+ * Obtain the next node of the update tree to which @a op should
+ * be propagated. The same as branching, but with a fast path:
+ * when @a field is a route and the operation prefix matches it,
+ * there is no need to parse JSON and dive into MessagePack,
+ * the route is simply followed via a lexer offset increase.
+ */
+static struct xrow_update_field *
+xrow_update_route_next(struct xrow_update_field *field, struct xrow_update_op *op)
+{
+	assert(field->type == XUPDATE_ROUTE);
+	assert(! xrow_update_op_is_term(op));
+	const char *new_path = op->lexer.src + op->lexer.offset;
+	int new_path_len = op->lexer.src_len - op->lexer.offset;
+	if (field->route.path_len <= new_path_len &&
+	    memcmp(field->route.path, new_path, field->route.path_len) == 0) {
+		/*
+		 * Fast path: jump to the next hop with no
+		 * decoding. It is used when several JSON updates
+		 * have the same prefix.
+		 */
+		op->lexer.offset += field->route.path_len;
+		return field->route.next_hop;
+	}
+	return xrow_update_route_branch(field, op);
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type)						\
+int										\
+xrow_update_op_do_route_##op_type(struct xrow_update_op *op,			\
+				  struct xrow_update_field *field)		\
+{										\
+	assert(field->type == XUPDATE_ROUTE);					\
+	struct xrow_update_field *next_hop = xrow_update_route_next(field, op);	\
+	if (next_hop == NULL)							\
+		return -1;							\
+	return xrow_update_op_do_field_##op_type(op, next_hop);			\
+}
+
+DO_SCALAR_OP_GENERIC(set)
+
+DO_SCALAR_OP_GENERIC(insert)
+
+DO_SCALAR_OP_GENERIC(delete)
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+uint32_t
+xrow_update_route_sizeof(struct xrow_update_field *field)
+{
+	return field->size - field->route.next_hop->size +
+	       xrow_update_field_sizeof(field->route.next_hop);
+}
+
+uint32_t
+xrow_update_route_store(struct xrow_update_field *field, char *out,
+			char *out_end)
+{
+	char *saved_out = out;
+	int before_hop = field->route.next_hop->data - field->data;
+	memcpy(out, field->data, before_hop);
+	out += before_hop;
+	out += xrow_update_field_store(field->route.next_hop, out, out_end);
+	int after_hop = before_hop + field->route.next_hop->size;
+	memcpy(out, field->data + after_hop, field->size - after_hop);
+	return out + field->size - after_hop - saved_out;
+}
diff --git a/test/box/update.result b/test/box/update.result
index ee38c100f..bf76f6638 100644
--- a/test/box/update.result
+++ b/test/box/update.result
@@ -1294,6 +1294,183 @@ vy_s:select()
 vy_s:drop()
 ---
 ...
+--
+-- Complex updates. Intersected paths, different types.
+--
+--
+-- =
+--
+-- Difference in the last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][1][2]', 375}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+      'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[325, 375], [400,
+        450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+-- Different starting from the non-last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][2][2]', 475}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
+      'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[325, 350], [400,
+        475]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+-- Contains string keys in the middle.
+t:update({{'=', '[2].c.f[1]', 4000}, {'=', '[2].c.f[2]', 5000}})
+---
+- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4000, 5000, 6, 7, 8],
+      'e': 500, 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350],
+      [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
+...
+t = {1}
+---
+...
+t[2] = setmetatable({}, {__serialize = 'map'})
+---
+...
+t[3] = {}
+---
+...
+t[4] = {                        \
+    1,                          \
+    2,                          \
+    3,                          \
+    {                           \
+        4,                      \
+        5,                      \
+        6,                      \
+        7,                      \
+        {                       \
+            8,                  \
+            9,                  \
+            {10, 11, 12},       \
+            13,                 \
+            14,                 \
+        },                      \
+        15,                     \
+        16,                     \
+        {17, 18, 19}            \
+    },                          \
+    20,                         \
+    {                           \
+        a = 21,                 \
+        b = 22,                 \
+        c = {                   \
+            d = 23,             \
+            e = 24              \
+        }                       \
+    },                          \
+    25                          \
+}
+---
+...
+t = s:replace(t)
+---
+...
+-- Non-first and non-last field updates.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][8][3]', 19000}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11000, 12], 13, 14], 15, 16, [17,
+        18, 19000]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Triple intersection with enabled fast jump by route.
+t:update({{'=', '[4][4][3]', 6000}, {'=', '[4][4][5][2]', 9000}, {'=', '[4][4][8][2]', 18000}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6000, 7, [8, 9000, [10, 11, 12], 13, 14], 15, 16, [
+        17, 18000, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Triple intersection with no optimizations.
+t:update({{'=', '[4][3]', 3000}, {'=', '[4][4][2]', 5000}, {'=', '[4][4][5][1]', 8000}})
+---
+- [1, {}, [], [1, 2, 3000, [4, 5000, 6, 7, [8000, 9, [10, 11, 12], 13, 14], 15, 16,
+      [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Last operation splits previous ones.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][5][3][1]', 10000}, {'=', '[4][4][3]', 6000}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6000, 7, [8, 9, [10000, 11000, 12], 13, 14], 15, 16,
+      [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- First two operations generate a route '[4]' in the 4th field of
+-- the tuple, with children [1] and [2]. The last operation should
+-- delete the route and replace it with a full-featured array.
+t:update({{'=', '[4][4][1]', 4.5}, {'=', '[4][4][2]', 5.5}, {'=', '[4][3]', 3.5}})
+---
+- [1, {}, [], [1, 2, 3.5, [4.5, 5.5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [
+        17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Test errors.
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][5].a', -1}})
+---
+- error: 'Field ''[4][4][5].a'' UPDATE error: can not update array by non-integer
+    index'
+...
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][*].a', -1}})
+---
+- error: 'Field ''[4][4][*].a'' UPDATE error: can not update array by non-integer
+    index'
+...
+t:update({{'=', '[4][4][*].b', 8000}, {'=', '[4][4][*].a', -1}})
+---
+- error: 'Field ''[4][4][*].b'' UPDATE error: invalid JSON in position 8'
+...
+t:update({{'=', '[4][4][1]', 4000}, {'=', '[4][4]-badjson', -1}})
+---
+- error: 'Field ''[4][4]-badjson'' UPDATE error: invalid JSON in position 7'
+...
+t:update({{'=', '[4][4][1]', 1}, {'=', '[4][4][1]', 2}})
+---
+- error: 'Field ''[4][4][1]'' UPDATE error: double update of the same field'
+...
+-- First two operations produce a zero-length bar update in field
+-- [4][4][5][4]. The third operation should raise an error.
+t:update({{'+', '[4][4][5][4]', 13000}, {'=', '[4][4][5][1]', 8000}, {'+', '[4][4][5][4]', 1}})
+---
+- error: 'Field ''[4][4][5][4]'' UPDATE error: double update of the same field'
+...
+--
+-- !
+--
+t:update({{'!', '[4][5]', 19.5}, {'!', '[4][6]', 19.75}, {'!', '[4][4][3]', 5.5}, {'=', '[4][4][2]', 5.25}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5.25, 5.5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16,
+      [17, 18, 19]], 19.5, 19.75, 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}},
+    25]]
+...
+--
+-- #
+--
+t:update({{'#', '[4][4][5]', 1}, {'#', '[4][4][4]', 1}, {'#', '[4][4][5]', 1}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 15, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {
+        'd': 23, 'e': 24}}, 25]]
+...
+--
+-- Scalar operations.
+--
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}})
+---
+- [1, {}, [], [1, 2, 3.5, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14.75], 15, 16, [
+        17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}, {'+', '[4][4][3]', 0.25}})
+---
+- [1, {}, [], [1, 2, 3.5, [4, 5, 6.25, 7, [8, 9, [10, 11, 12], 13, 14.75], 15, 16,
+      [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Combined - check that operations following '!' take into
+-- account that there is a new element. Indexes should be relative
+-- to the new array size.
+t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]', 0.25}})
+---
+- [1, {}, [], [1, 2, 2.5, 3, [4, 5, 6.25, 7, [8, 9, [10, 11, 12], 13, 14.75], 15,
+      16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+-- Error - an attempt to follow a JSON path in a scalar field
+-- during branching.
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
+---
+- error: Field ''[4][3][2]'' was not found in the tuple
+...
 s:drop()
 ---
 ...
diff --git a/test/box/update.test.lua b/test/box/update.test.lua
index 60e669d27..d91a2e95c 100644
--- a/test/box/update.test.lua
+++ b/test/box/update.test.lua
@@ -424,4 +424,95 @@ box.commit()
 vy_s:select()
 vy_s:drop()
 
+--
+-- Complex updates. Intersected paths, different types.
+--
+
+--
+-- =
+--
+-- Difference in the last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][1][2]', 375}})
+-- Different starting from the non-last part.
+t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][2][2]', 475}})
+-- Contains string keys in the middle.
+t:update({{'=', '[2].c.f[1]', 4000}, {'=', '[2].c.f[2]', 5000}})
+t = {1}
+t[2] = setmetatable({}, {__serialize = 'map'})
+t[3] = {}
+t[4] = {                        \
+    1,                          \
+    2,                          \
+    3,                          \
+    {                           \
+        4,                      \
+        5,                      \
+        6,                      \
+        7,                      \
+        {                       \
+            8,                  \
+            9,                  \
+            {10, 11, 12},       \
+            13,                 \
+            14,                 \
+        },                      \
+        15,                     \
+        16,                     \
+        {17, 18, 19}            \
+    },                          \
+    20,                         \
+    {                           \
+        a = 21,                 \
+        b = 22,                 \
+        c = {                   \
+            d = 23,             \
+            e = 24              \
+        }                       \
+    },                          \
+    25                          \
+}
+t = s:replace(t)
+-- Non-first and non-last field updates.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][8][3]', 19000}})
+-- Triple intersection with enabled fast jump by route.
+t:update({{'=', '[4][4][3]', 6000}, {'=', '[4][4][5][2]', 9000}, {'=', '[4][4][8][2]', 18000}})
+-- Triple intersection with no optimizations.
+t:update({{'=', '[4][3]', 3000}, {'=', '[4][4][2]', 5000}, {'=', '[4][4][5][1]', 8000}})
+-- Last operation splits previous ones.
+t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][5][3][1]', 10000}, {'=', '[4][4][3]', 6000}})
+-- First two operations generate a route '[4]' in the 4th field of
+-- the tuple, with children [1] and [2]. The last operation should
+-- delete the route and replace it with a full-featured array.
+t:update({{'=', '[4][4][1]', 4.5}, {'=', '[4][4][2]', 5.5}, {'=', '[4][3]', 3.5}})
+-- Test errors.
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][5].a', -1}})
+t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][*].a', -1}})
+t:update({{'=', '[4][4][*].b', 8000}, {'=', '[4][4][*].a', -1}})
+t:update({{'=', '[4][4][1]', 4000}, {'=', '[4][4]-badjson', -1}})
+t:update({{'=', '[4][4][1]', 1}, {'=', '[4][4][1]', 2}})
+-- First two operations produce a zero-length bar update in field
+-- [4][4][5][4]. The third operation should raise an error.
+t:update({{'+', '[4][4][5][4]', 13000}, {'=', '[4][4][5][1]', 8000}, {'+', '[4][4][5][4]', 1}})
+
+--
+-- !
+--
+t:update({{'!', '[4][5]', 19.5}, {'!', '[4][6]', 19.75}, {'!', '[4][4][3]', 5.5}, {'=', '[4][4][2]', 5.25}})
+--
+-- #
+--
+t:update({{'#', '[4][4][5]', 1}, {'#', '[4][4][4]', 1}, {'#', '[4][4][5]', 1}})
+--
+-- Scalar operations.
+--
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}})
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}, {'+', '[4][4][3]', 0.25}})
+-- Combined - check that operations following '!' take into
+-- account that there is a new element. Indexes should be relative
+-- to the new array size.
+t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]', 0.25}})
+-- Error - an attempt to follow a JSON path in a scalar field
+-- during branching.
+t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
+
 s:drop()
-- 
2.21.0 (Apple Git-122.2)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps
  2019-12-23 22:41 [Tarantool-patches] [PATCH 0/3] JSON update Vladislav Shpilevoy
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable Vladislav Shpilevoy
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays Vladislav Shpilevoy
@ 2019-12-23 22:41 ` Vladislav Shpilevoy
  2019-12-27 14:57   ` Sergey Ostanevich
  2019-12-30  5:26 ` [Tarantool-patches] [PATCH 0/3] JSON update Kirill Yukhin
  3 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-23 22:41 UTC (permalink / raw)
  To: tarantool-patches

Previous commits introduced isolated JSON updates, and then
allowed intersections at arrays. This one completes the puzzle,
adding intersections at maps. Now JSON updates are fully
available.

Closes #1261

@TarantoolBot document
Title: JSON updates
Tuple/space/index:update/upsert now support JSON paths: all the
same paths as are allowed in tuple["..."].

Example:
box.cfg{}
format = {}
format[1] = {'field1', 'unsigned'}
format[2] = {'field2', 'map'}
format[3] = {'field3', 'array'}
format[4] = {'field4', 'string', is_nullable = true}
s = box.schema.create_space('test', {format = format})
_ = s:create_index('pk')
t = {
    1,
    {
        key1 = 'value',
        key2 = 10
    },
    {
        2,
        3,
        {key3 = 20}
    }
}
t = s:replace(t)

tarantool> t:update({{'=', 'field2.key1', 'new_value'}})
---
- [1, {'key1': 'new_value', 'key2': 10}, [2, 3, {'key3': 20}]]
...

tarantool> t:update({{'+', 'field3[2]', 1}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 4, {'key3': 20}]]
...

tarantool> s:update({1}, {{'!', 'field4', 'inserted value'}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 3, {'key3': 20}], 'inserted value']
...

tarantool> s:update({1}, {{'#', '[2].key2', 1}, {'=', '[3][3].key4', 'value4'}})
---
- [1, {'key1': 'value'}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...

tarantool> s:upsert({1, {k = 'v'}, {}}, {{'#', '[2].key1', 1}})
---
...

tarantool> s:select{}
---
- - [1, {}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...

Note that the same rule as in tuple field access by JSON applies
to field names that look like JSON paths. First the whole path is
interpreted as a field name. If such a name does not exist, it is
treated as a path. For example, if there is a field named
'field.name.like.json', then this update

    <obj>:update({..., 'field.name.like.json', ...})

will update this field, instead of keys 'field' -> 'name' ->
'like' -> 'json'. If such a name is needed as part of a bigger
path, then it should be wrapped in quotes and []:

    <obj>:update({..., '["field.name.like.json"].next.fields', ...})

There are some new rules for JSON updates:

- Operation '!' can't be used to create all intermediate nodes of
  a path. For example, {'!', 'field1[1].field3', ...} can't
  create fields 'field1' and '[1]'; they must already exist.

- Operation '#', when applied to maps, can't delete more than one
  key at once. That is, its argument should always be 1 for maps.

    {'#', 'field1.field2', 1} - this is allowed;
    {'#', 'field1.field2', 10} - this is not.

  That limitation stems from the fact that keys in a map are not
  ordered in any way, so '#' with more than 1 key would lead to
  undefined behaviour.

- Operation '!' on maps can't create a key if it already exists.

- If a map contains non-string keys (booleans, numbers, maps,
  arrays - anything), then these keys can't be updated via JSON
  paths. But it is still allowed to update string keys in such a
  map.
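
For instance, with the space 's' and a tuple like 't' from the
example above, the new rules would manifest like this (a sketch,
not verbatim output; the exact error messages may differ):

-- Error: '!' can't create the intermediate key 'key3'.
t:update({{'!', 'field2.key3.key4', 1}})
-- Error: '#' can't delete 2 keys at once from a map.
t:update({{'#', 'field2.key1', 2}})
-- Error: '!' can't insert 'key1', because it already exists.
t:update({{'!', 'field2.key1', 'other'}})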

Why are JSON updates good, and why should they be preferred when
only a part of a tuple needs to be updated?

- They consume less space in WAL, because for an update only its
  keys, operations, and arguments are stored. It is cheaper to
  store an update of one deep field than the whole tuple.

- They are faster. Firstly, this is because they are implemented
  in C, and have no problems with Lua GC and dynamic typing.
  Secondly, some cases of JSON paths are highly optimized. For
  example, an update with a single JSON path costs O(1) memory
  regardless of how deep that path goes (not counting update
  arguments).

- They are available from remote clients, like any other DML.
  Before JSON updates, changing one deep part of a tuple required
  downloading that tuple, updating it in memory, and sending it
  back: 2 network hops. With JSON paths it can be done in 1, when
  the update can be described in paths.
---
 src/box/CMakeLists.txt      |   1 +
 src/box/xrow_update_field.c |   4 +
 src/box/xrow_update_field.h |  85 ++++++-
 src/box/xrow_update_map.c   | 453 ++++++++++++++++++++++++++++++++++++
 src/box/xrow_update_route.c |  47 +++-
 test/box/update.result      | 152 ++++++++++++
 test/box/update.test.lua    |  75 ++++++
 7 files changed, 813 insertions(+), 4 deletions(-)
 create mode 100644 src/box/xrow_update_map.c

diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index f99efc503..2022b378b 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -45,6 +45,7 @@ add_library(tuple STATIC
     xrow_update_array.c
     xrow_update_bar.c
     xrow_update_route.c
+    xrow_update_map.c
     tuple_compare.cc
     tuple_extract_key.cc
     tuple_hash.cc
diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
index 3a4aa6c90..a8db31768 100644
--- a/src/box/xrow_update_field.c
+++ b/src/box/xrow_update_field.c
@@ -116,6 +116,8 @@ xrow_update_field_sizeof(struct xrow_update_field *field)
 		return xrow_update_bar_sizeof(field);
 	case XUPDATE_ROUTE:
 		return xrow_update_route_sizeof(field);
+	case XUPDATE_MAP:
+		return xrow_update_map_sizeof(field);
 	default:
 		unreachable();
 	}
@@ -145,6 +147,8 @@ xrow_update_field_store(struct xrow_update_field *field, char *out,
 		return xrow_update_bar_store(field, out, out_end);
 	case XUPDATE_ROUTE:
 		return xrow_update_route_store(field, out, out_end);
+	case XUPDATE_MAP:
+		return xrow_update_map_store(field, out, out_end);
 	default:
 		unreachable();
 	}
diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
index f07b5a649..6b2a0d999 100644
--- a/src/box/xrow_update_field.h
+++ b/src/box/xrow_update_field.h
@@ -32,6 +32,7 @@
  */
 #include "trivia/util.h"
 #include "tt_static.h"
+#include "salad/stailq.h"
 #include "json/json.h"
 #include "bit/int96.h"
 #include "mp_decimal.h"
@@ -254,7 +255,8 @@ enum xrow_update_type {
 	 * for example, when a rope is split in two parts: not
 	 * changed left range of fields, and a right range with
 	 * its first field changed. In this case the left range is
-	 * NOP.
+	 * NOP. The same happens when a map is updated and split
+	 * into ranges, where only certain points are not NOP.
 	 */
 	XUPDATE_NOP,
 	/**
@@ -292,6 +294,11 @@ enum xrow_update_type {
 	 * '[2] = 30', '[3] = true'.
 	 */
 	XUPDATE_ROUTE,
+	/**
+	 * Field is an updated map. Check item list for updates of
+	 * individual fields.
+	 */
+	XUPDATE_MAP,
 };
 
 /**
@@ -383,6 +390,48 @@ struct xrow_update_field {
 			/** Update subtree. */
 			struct xrow_update_field *next_hop;
 		} route;
+		/**
+		 * The field is an updated map. Individual fields
+		 * are stored very similarly to an array update
+		 * and its rope nodes. Each item is a key, a
+		 * value, and a tail of unchanged key-value pairs.
+		 * The items are stored in a list. The list is not
+		 * sorted by keys, because that would not make any
+		 * sense:
+		 *
+		 * 1) Keys in MessagePack are not sorted anyway,
+		 *    so no efficient search is possible without
+		 *    sorting first. Sorting a map would require
+		 *    N*log(N) time and N memory even if only a
+		 *    few fields were updated.
+		 *
+		 * 2) A double scalar update of the same key is
+		 *    not possible.
+		 *
+		 * Still, one thing can be and is optimized. When
+		 * a key is updated for the first time, it is
+		 * moved to the beginning of the list, and after
+		 * all operations are done, it is stored in the
+		 * result tuple before the unchanged fields. On a
+		 * second update of the same key it is found
+		 * immediately.
+		 */
+		struct {
+			/**
+			 * List of map update items. Each item is
+			 * a key, a value, and an unchanged tail.
+			 */
+			struct stailq items;
+			/**
+			 * Number of key-value pairs in the list.
+			 * Note, it is not the number of items,
+			 * but exactly the number of key-value
+			 * pairs. It is used to store the
+			 * MessagePack map header without decoding
+			 * each item again just to learn the size.
+			 */
+			int size;
+		} map;
 	};
 };
 
@@ -506,6 +555,38 @@ OP_DECL_GENERIC(array)
 
 /* }}} xrow_update_field.array */
 
+/* {{{ xrow_update_field.map */
+
+/**
+ * Initialize @a field as a map to update.
+ * @param[out] field Field to initialize.
+ * @param header Header of the MessagePack map @a data.
+ * @param data MessagePack data of the map to update.
+ * @param data_end End of @a data.
+ * @param field_count Key-value pair count in @a data.
+ *
+ * @retval  0 Success.
+ * @retval -1 Error.
+ */
+int
+xrow_update_map_create(struct xrow_update_field *field, const char *header,
+		       const char *data, const char *data_end, int field_count);
+
+/**
+ * Create a map update with an existing child. Motivation is
+ * exactly the same as for the similar array constructor: it
+ * helps to avoid unnecessary MessagePack decoding.
+ */
+int
+xrow_update_map_create_with_child(struct xrow_update_field *field,
+				  const char *header,
+				  const struct xrow_update_field *child,
+				  const char *key, uint32_t key_len);
+
+OP_DECL_GENERIC(map)
+
+/* }}} xrow_update_field.map */
+
 /* {{{ update_field.bar */
 
 OP_DECL_GENERIC(bar)
@@ -587,6 +668,8 @@ xrow_update_op_do_field_##op_type(struct xrow_update_op *op,			\
 		return xrow_update_op_do_bar_##op_type(op, field);		\
 	case XUPDATE_ROUTE:							\
 		return xrow_update_op_do_route_##op_type(op, field);		\
+	case XUPDATE_MAP:							\
+		return xrow_update_op_do_map_##op_type(op, field);		\
 	default:								\
 		unreachable();							\
 	}									\
diff --git a/src/box/xrow_update_map.c b/src/box/xrow_update_map.c
new file mode 100644
index 000000000..22fabff96
--- /dev/null
+++ b/src/box/xrow_update_map.c
@@ -0,0 +1,453 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "xrow_update_field.h"
+#include "fiber.h"
+#include "small/region.h"
+#include "tuple_format.h"
+
+/**
+ * Descriptor of one updated key-value pair. Besides updated data
+ * it contains a tail with unchanged pairs, just so as not to
+ * create a separate object for them, and to be similar to array
+ * update items.
+ */
+struct xrow_update_map_item {
+	/**
+	 * Updated key. Can be NULL. In such a case this item
+	 * contains only an unchanged tail. A key can be nullified
+	 * if it was removed from the map, or when a map is just
+	 * created and has no updates yet.
+	 */
+	const char *key;
+	/** Length of @a key. */
+	uint32_t key_len;
+	/** Updated value. */
+	struct xrow_update_field field;
+	/** Link in the list of updated map keys. */
+	struct stailq_entry in_items;
+	/**
+	 * Size in bytes of unchanged tail data. It goes right
+	 * after @a field.data.
+	 */
+	uint32_t tail_size;
+};
+
+static inline struct xrow_update_map_item *
+xrow_update_map_item_alloc(void)
+{
+	struct xrow_update_map_item *item = (struct xrow_update_map_item *)
+		region_alloc(&fiber()->gc, sizeof(*item));
+	if (item == NULL)
+		diag_set(OutOfMemory, sizeof(*item), "region_alloc", "item");
+	return item;
+}
+
+static void
+xrow_update_map_create_item(struct xrow_update_field *field,
+			    struct xrow_update_map_item *item,
+			    enum xrow_update_type type, const char *key,
+			    uint32_t key_len, const char *data,
+			    uint32_t data_size, uint32_t tail_size)
+{
+	assert(field->type == XUPDATE_MAP);
+	item->key = key;
+	item->key_len = key_len;
+	item->field.type = type;
+	item->field.data = data;
+	item->field.size = data_size;
+	item->tail_size = tail_size;
+	/*
+	 * Each time a new item is created, it is stored at the
+	 * head of the update map item list. This helps when the
+	 * tuple is updated regularly, because on all subsequent
+	 * updates this key will be found at the very beginning
+	 * of the map.
+	 */
+	stailq_add_entry(&field->map.items, item, in_items);
+}
+
+static inline struct xrow_update_map_item *
+xrow_update_map_new_item(struct xrow_update_field *field,
+			 enum xrow_update_type type, const char *key,
+			 uint32_t key_len, const char *data, uint32_t data_size,
+			 uint32_t tail_size)
+{
+	struct xrow_update_map_item *item = xrow_update_map_item_alloc();
+	if (item != NULL) {
+		xrow_update_map_create_item(field, item, type, key, key_len,
+					    data, data_size, tail_size);
+	}
+	return item;
+}
+
+/**
+ * Find an update item to which @a op should be applied. The
+ * target field may not exist, but at least its parent should.
+ */
+static int
+xrow_update_map_extract_opt_item(struct xrow_update_field *field,
+				 struct xrow_update_op *op,
+				 struct xrow_update_map_item **res)
+{
+	assert(field->type == XUPDATE_MAP);
+	if (op->token_type == JSON_TOKEN_END) {
+		if (xrow_update_op_consume_token(op) != 0)
+			return -1;
+		if (op->token_type != JSON_TOKEN_STR) {
+			return xrow_update_err(op, "can't update a map by not "\
+					       "a string key");
+		}
+	}
+	struct stailq *items = &field->map.items;
+	struct xrow_update_map_item *i, *new_item;
+	/*
+	 * First, try to find the key among the already updated
+	 * ones. Perhaps the needed node is just an intermediate
+	 * key of a bigger JSON path, and many updates pass
+	 * through this key, so it should be here for all but
+	 * the first update.
+	 */
+	stailq_foreach_entry(i, items, in_items) {
+		if (i->key != NULL && i->key_len == op->key_len &&
+		    memcmp(i->key, op->key, i->key_len) == 0) {
+			*res = i;
+			return 0;
+		}
+	}
+	/*
+	 * Slow path: the key is updated for the first time, so
+	 * the tails need to be decoded.
+	 */
+	uint32_t key_len, i_tail_size;
+	const char *pos, *key, *end, *tmp, *begin;
+	stailq_foreach_entry(i, items, in_items) {
+		begin = i->field.data + i->field.size;
+		pos = begin;
+		end = begin + i->tail_size;
+		i_tail_size = 0;
+		while (pos < end) {
+			if (mp_typeof(*pos) != MP_STR) {
+				mp_next(&pos);
+				mp_next(&pos);
+				continue;
+			}
+			i_tail_size = pos - begin;
+			key = mp_decode_str(&pos, &key_len);
+			if (key_len == op->key_len &&
+			    memcmp(key, op->key, key_len) == 0)
+				goto key_is_found;
+			mp_next(&pos);
+		}
+	}
+	*res = NULL;
+	return 0;
+
+key_is_found:
+	tmp = pos;
+	mp_next(&tmp);
+	if (i_tail_size == 0 && i->key == NULL) {
+		/*
+		 * Looks like the needed key was found at the
+		 * beginning of the tail of an item without a key.
+		 * This is good, because this item can be
+		 * transformed instead of creating a new one.
+		 */
+		i->key = op->key;
+		i->key_len = op->key_len;
+		i->field.data = pos;
+		i->field.size = tmp - pos;
+		i->tail_size = end - tmp;
+		new_item = i;
+	} else {
+		new_item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
+						    op->key_len, pos, tmp - pos,
+						    end - tmp);
+		if (new_item == NULL)
+			return -1;
+		i->tail_size = i_tail_size;
+	}
+	*res = new_item;
+	return 0;
+}
+
+/**
+ * The same as the optional extractor, but the field to update
+ * must exist. This is the case for all scalar operations
+ * (except '=', which can work as an insert).
+ */
+static inline struct xrow_update_map_item *
+xrow_update_map_extract_item(struct xrow_update_field *field,
+			     struct xrow_update_op *op)
+{
+	assert(field->type == XUPDATE_MAP);
+	struct xrow_update_map_item *res;
+	if (xrow_update_map_extract_opt_item(field, op, &res) != 0)
+		return NULL;
+	if (res == NULL)
+		xrow_update_err_no_such_field(op);
+	return res;
+}
+
+int
+xrow_update_op_do_map_insert(struct xrow_update_op *op,
+			     struct xrow_update_field *field)
+{
+	assert(field->type == XUPDATE_MAP);
+	struct xrow_update_map_item *item;
+	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
+		return -1;
+	if (! xrow_update_op_is_term(op)) {
+		if (item == NULL)
+			return xrow_update_err_no_such_field(op);
+		op->token_type = JSON_TOKEN_END;
+		return xrow_update_op_do_field_insert(op, &item->field);
+	}
+	if (item != NULL)
+		return xrow_update_err_duplicate(op);
+	++field->map.size;
+	item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
+					op->key_len, op->arg.set.value,
+					op->arg.set.length, 0);
+	return item != NULL ? 0 : -1;
+}
+
+int
+xrow_update_op_do_map_set(struct xrow_update_op *op,
+			  struct xrow_update_field *field)
+{
+	assert(field->type == XUPDATE_MAP);
+	struct xrow_update_map_item *item;
+	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
+		return -1;
+	if (! xrow_update_op_is_term(op)) {
+		if (item == NULL)
+			return xrow_update_err_no_such_field(op);
+		op->token_type = JSON_TOKEN_END;
+		return xrow_update_op_do_field_set(op, &item->field);
+	}
+	if (item != NULL) {
+		op->new_field_len = op->arg.set.length;
+		/* Ignore the previous op, if any. */
+		item->field.type = XUPDATE_SCALAR;
+		item->field.scalar.op = op;
+		return 0;
+	}
+	++field->map.size;
+	item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
+					op->key_len, op->arg.set.value,
+					op->arg.set.length, 0);
+	return item != NULL ? 0 : -1;
+}
+
+int
+xrow_update_op_do_map_delete(struct xrow_update_op *op,
+			     struct xrow_update_field *field)
+{
+	assert(field->type == XUPDATE_MAP);
+	struct xrow_update_map_item *item;
+	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
+		return -1;
+	if (! xrow_update_op_is_term(op)) {
+		if (item == NULL)
+			return xrow_update_err_no_such_field(op);
+		op->token_type = JSON_TOKEN_END;
+		return xrow_update_op_do_field_delete(op, &item->field);
+	}
+	if (op->arg.del.count != 1)
+		return xrow_update_err_delete1(op);
+	if (item == NULL)
+		return 0;
+	/*
+	 * Note that even if the tail size becomes 0, this item
+	 * is not deleted. Items are linked via a stailq, whose
+	 * elements can't be removed as simply as that. It is not
+	 * a big deal though, because '#' is a really rare
+	 * operation.
+	 * Why can't the next key simply be decoded from the
+	 * tail, and why must the key be NULL here? Because JSON
+	 * updates allow updating a map that contains non-string
+	 * keys. If the next key is not a string, it can't be
+	 * stored as the item's key. Therefore, it is better not
+	 * to touch unchanged tails unless a new update operation
+	 * needs them.
+	 */
+	item->key = NULL;
+	item->key_len = 0;
+	item->field.data += item->field.size;
+	item->field.size = 0;
+	item->field.type = XUPDATE_NOP;
+	--field->map.size;
+	return 0;
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type)						\
+int										\
+xrow_update_op_do_map_##op_type(struct xrow_update_op *op,			\
+				struct xrow_update_field *field)		\
+{										\
+	assert(field->type == XUPDATE_MAP);					\
+	struct xrow_update_map_item *item =					\
+		xrow_update_map_extract_item(field, op);			\
+	if (item == NULL)							\
+		return -1;							\
+	if (! xrow_update_op_is_term(op)) {					\
+		op->token_type = JSON_TOKEN_END;				\
+		return xrow_update_op_do_field_##op_type(op, &item->field);	\
+	}									\
+	if (item->field.type != XUPDATE_NOP)					\
+		return xrow_update_err_double(op);				\
+	if (xrow_update_op_do_##op_type(op, item->field.data) != 0)		\
+		return -1;							\
+	item->field.type = XUPDATE_SCALAR;					\
+	item->field.scalar.op = op;						\
+	return 0;								\
+}
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+int
+xrow_update_map_create(struct xrow_update_field *field, const char *header,
+		       const char *data, const char *data_end, int field_count)
+{
+	field->type = XUPDATE_MAP;
+	field->data = header;
+	field->size = data_end - header;
+	field->map.size = field_count;
+	stailq_create(&field->map.items);
+	if (field_count == 0)
+		return 0;
+	struct xrow_update_map_item *first =
+		xrow_update_map_new_item(field, XUPDATE_NOP, NULL, 0, data, 0,
+					 data_end - data);
+	return first != NULL ? 0 : -1;
+}
+
+int
+xrow_update_map_create_with_child(struct xrow_update_field *field,
+				  const char *header,
+				  const struct xrow_update_field *child,
+				  const char *key, uint32_t key_len)
+{
+	field->type = XUPDATE_MAP;
+	field->data = header;
+	stailq_create(&field->map.items);
+
+	const char *pos = header;
+	uint32_t i, ikey_len, field_count = mp_decode_map(&pos);
+	const char *begin = pos;
+	struct xrow_update_map_item *item = NULL;
+	for (i = 0; i < field_count; ++i) {
+		if (mp_typeof(*pos) != MP_STR) {
+			mp_next(&pos);
+			mp_next(&pos);
+			continue;
+		}
+		const char *before_key = pos;
+		const char *ikey = mp_decode_str(&pos, &ikey_len);
+		if (ikey_len == key_len && memcmp(ikey, key, key_len) == 0) {
+			item = xrow_update_map_new_item(field, XUPDATE_NOP,
+							NULL, 0, begin, 0,
+							before_key - begin);
+			if (item == NULL)
+				return -1;
+			++i;
+			break;
+		}
+		mp_next(&pos);
+	}
+	/*
+	 * When a map is initialized with an existing child, the
+	 * child was already found earlier, so it is impossible
+	 * to miss it here.
+	 */
+	assert(item != NULL);
+	const char *data = pos;
+	mp_next(&pos);
+	uint32_t data_size = pos - data;
+	for (; i < field_count; ++i) {
+		mp_next(&pos);
+		mp_next(&pos);
+	}
+	item = xrow_update_map_item_alloc();
+	if (item == NULL)
+		return -1;
+	item->field = *child;
+	xrow_update_map_create_item(field, item, child->type, key, key_len,
+				    data, data_size, pos - data - data_size);
+	field->map.size = field_count;
+	field->size = pos - header;
+	return 0;
+}
+
+uint32_t
+xrow_update_map_sizeof(struct xrow_update_field *field)
+{
+	assert(field->type == XUPDATE_MAP);
+	uint32_t res = mp_sizeof_map(field->map.size);
+	struct xrow_update_map_item *i;
+	stailq_foreach_entry(i, &field->map.items, in_items) {
+		res += i->tail_size;
+		if (i->key != NULL) {
+			res += mp_sizeof_str(i->key_len) +
+			       xrow_update_field_sizeof(&i->field);
+		}
+	}
+	return res;
+}
+
+uint32_t
+xrow_update_map_store(struct xrow_update_field *field, char *out, char *out_end)
+{
+	assert(field->type == XUPDATE_MAP);
+	char *out_begin = out;
+	out = mp_encode_map(out, field->map.size);
+	struct xrow_update_map_item *i;
+	/*
+	 * The trick here is to store updated keys before the
+	 * others: the first loop stores only the changed
+	 * key-value pairs, while unchanged tails are copied by
+	 * the second loop.
+	 */
+	stailq_foreach_entry(i, &field->map.items, in_items) {
+		if (i->key != NULL) {
+			out = mp_encode_str(out, i->key, i->key_len);
+			out += xrow_update_field_store(&i->field, out, out_end);
+		}
+	}
+	stailq_foreach_entry(i, &field->map.items, in_items) {
+		memcpy(out, i->field.data + i->field.size, i->tail_size);
+		out += i->tail_size;
+	}
+	assert(out <= out_end);
+	return out - out_begin;
+}
diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
index f374db837..b76ebb36b 100644
--- a/src/box/xrow_update_route.c
+++ b/src/box/xrow_update_route.c
@@ -123,6 +123,38 @@ xrow_update_route_branch_array(struct xrow_update_field *next_hop,
 	return op->meta->do_op(op, next_hop);
 }
 
+/**
+ * Do the actual branching, but via a map and a key in that map.
+ * Works exactly the same as the array counterpart.
+ */
+static int
+xrow_update_route_branch_map(struct xrow_update_field *next_hop,
+			     const char *parent,
+			     const struct xrow_update_field *child,
+			     const char *key, int key_len)
+{
+	struct xrow_update_op *op = child->bar.op;
+	if (child->type != XUPDATE_BAR || child->bar.path_len > 0 ||
+	    (op->opcode != '!' && op->opcode != '#')) {
+		return xrow_update_map_create_with_child(next_hop, parent,
+							 child, key, key_len);
+	}
+	op->token_type = JSON_TOKEN_STR;
+	op->key = key;
+	op->key_len = key_len;
+	const char *data = parent;
+	uint32_t field_count = mp_decode_map(&data);
+	const char *end = data;
+	for (uint32_t i = 0; i < field_count; ++i) {
+		mp_next(&end);
+		mp_next(&end);
+	}
+	if (xrow_update_map_create(next_hop, parent, data, end,
+				   field_count) != 0)
+		return -1;
+	return op->meta->do_op(op, next_hop);
+}
+
 struct xrow_update_field *
 xrow_update_route_branch(struct xrow_update_field *field,
 			 struct xrow_update_op *new_op)
@@ -256,9 +288,18 @@ xrow_update_route_branch(struct xrow_update_field *field,
 						   old_token.num) != 0)
 			return NULL;
 	} else if (type == MP_MAP) {
-		diag_set(ClientError, ER_UNSUPPORTED, "update",
-			 "path intersection on map");
-		return NULL;
+		if (new_token.type != JSON_TOKEN_STR) {
+			xrow_update_err(new_op, "can not update map by "\
+					"non-string key");
+			return NULL;
+		}
+		new_op->token_type = JSON_TOKEN_STR;
+		new_op->key = new_token.str;
+		new_op->key_len = new_token.len;
+		if (xrow_update_route_branch_map(next_hop, parent, &child,
+						 old_token.str,
+						 old_token.len) != 0)
+			return NULL;
 	} else {
 		xrow_update_err_no_such_field(new_op);
 		return NULL;
diff --git a/test/box/update.result b/test/box/update.result
index bf76f6638..4f9349bce 100644
--- a/test/box/update.result
+++ b/test/box/update.result
@@ -1471,6 +1471,158 @@ t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
 ---
 - error: Field ''[4][3][2]'' was not found in the tuple
 ...
+--
+-- Intersecting map updates.
+--
+t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [17, 18,
+        19]], 20, {'b': 22, 'a': 21, 'c': {'d': 2300, 'e': 2400}}, 25]]
+...
+-- Fill an empty map.
+t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
+---
+- [1, {'k': 100, 'v': 200}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14],
+      15, 16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+t = {}                                  \
+t[1] = 1                                \
+t[2] = {                                \
+    a = 1,                              \
+    b = 2,                              \
+    c = {3, 4, 5},                      \
+    d = {                               \
+        [0] = 0,                        \
+        e = 6,                          \
+        [true] = false,                 \
+        f = 7,                          \
+        [-1] = -1,                      \
+    },                                  \
+    g = {                               \
+        {k = 8, v = 9},                 \
+        [box.NULL] = box.NULL,          \
+        {k = 10, v = 11},               \
+        {k = '12str', v = '13str'},     \
+    },                                  \
+    h = {                               \
+        [-1] = {{{{-1}, -1}, -1}, -1},  \
+        i = {                           \
+            j = 14,                     \
+            k = 15,                     \
+        },                              \
+        m = 16,                         \
+    }                                   \
+}
+---
+...
+t[3] = {}
+---
+...
+t = box.tuple.new(t)
+---
+...
+-- Insert and delete from the same map at once.
+t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
+---
+- [1, {'a': 1, 'n': 17, 'd': {0: 0, -1: -1, true: false, 'f': 7, 'e': 6}, 'h': {'i': {
+        'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3, 4, 5], 'g': {
+      1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+      null: null}}, []]
+...
+-- Delete a non-existing key.
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
+---
+- [1, {'b': 2, 'a': 1, 'd': {0: 0, 'f': 7, 'e': 6, -1: -1, true: false, 'g': 6000},
+    'h': {'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3,
+      4, 5], 'g': {1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+      null: null}}, []]
+...
+-- Scalar operations.
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
+---
+- [1, {'b': 3, 'c': [3, 4, 5], 'd': {0: 0, true: false, 'f': 8, -1: -1, 'e': 7}, 'h': {
+      'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'a': 1, 'g': {
+      1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+      null: null}}, []]
+...
+-- Errors.
+-- This operation array creates an update tree with several
+-- full-featured maps, not bars. It is required to cover some
+-- complex cases of field extraction.
+ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
+---
+...
+-- When a map update extracts the next token on demand, it should
+-- obviously reject bad JSON. Scalar and '!'/'#' operations are
+-- implemented separately, so each of them is tested.
+ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+ops[#ops][1] = '!'
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+ops[#ops][1] = '#'
+---
+...
+ops[#ops][3] = 1
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+-- Key extractor should reject any attempt to use non-string keys.
+ops[#ops] = {'-', '[2].h.i[20]', -1}
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
+    key'
+...
+ops[#ops][1] = '='
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
+    key'
+...
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
+---
+- error: Field ''[2].d.k'' was not found in the tuple
+...
+t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
+---
+- error: Field ''[2].d.g.h'' was not found in the tuple
+...
+t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
+---
+- error: 'Field ''[2].d.e'' UPDATE error: the key exists already'
+...
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
+---
+- error: 'Field ''[2].d.old'' UPDATE error: can delete only 1 field from a map in
+    a row'
+...
+-- '!'/'=' can be used to create a field, but only if it is a
+-- tail. These operations can't create the whole path.
+t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
+---
+- error: Field ''[2].d.new.new'' was not found in the tuple
+...
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
+---
+- error: Field ''[2].d.old.old'' was not found in the tuple
+...
 s:drop()
 ---
 ...
diff --git a/test/box/update.test.lua b/test/box/update.test.lua
index d91a2e95c..6fed818f4 100644
--- a/test/box/update.test.lua
+++ b/test/box/update.test.lua
@@ -515,4 +515,79 @@ t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]',
 -- during branching.
 t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
 
+--
+-- Intersecting map updates.
+--
+t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
+-- Fill an empty map.
+t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
+
+t = {}                                  \
+t[1] = 1                                \
+t[2] = {                                \
+    a = 1,                              \
+    b = 2,                              \
+    c = {3, 4, 5},                      \
+    d = {                               \
+        [0] = 0,                        \
+        e = 6,                          \
+        [true] = false,                 \
+        f = 7,                          \
+        [-1] = -1,                      \
+    },                                  \
+    g = {                               \
+        {k = 8, v = 9},                 \
+        [box.NULL] = box.NULL,          \
+        {k = 10, v = 11},               \
+        {k = '12str', v = '13str'},     \
+    },                                  \
+    h = {                               \
+        [-1] = {{{{-1}, -1}, -1}, -1},  \
+        i = {                           \
+            j = 14,                     \
+            k = 15,                     \
+        },                              \
+        m = 16,                         \
+    }                                   \
+}
+t[3] = {}
+t = box.tuple.new(t)
+-- Insert and delete from the same map at once.
+t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
+-- Delete a non-existing key.
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
+
+-- Scalar operations.
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
+
+-- Errors.
+-- This operation array creates an update tree with several
+-- full-featured maps, not bars. It is required to cover some
+-- complex cases of field extraction.
+ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
+-- When a map update extracts the next token on demand, it should
+-- obviously reject bad JSON. Scalar and '!'/'#' operations are
+-- implemented separately, so each of them is tested.
+ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
+t:update(ops)
+ops[#ops][1] = '!'
+t:update(ops)
+ops[#ops][1] = '#'
+ops[#ops][3] = 1
+t:update(ops)
+-- Key extractor should reject any attempt to use non-string keys.
+ops[#ops] = {'-', '[2].h.i[20]', -1}
+t:update(ops)
+ops[#ops][1] = '='
+t:update(ops)
+
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
+t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
+t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
+-- '!'/'=' can be used to create a field, but only if it is a
+-- tail. These operations can't create the whole path.
+t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
+
 s:drop()
-- 
2.21.0 (Apple Git-122.2)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable Vladislav Shpilevoy
@ 2019-12-24 16:15   ` Vladislav Shpilevoy
  2019-12-26 12:07   ` Sergey Ostanevich
  1 sibling, 0 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-24 16:15 UTC (permalink / raw)
  To: tarantool-patches, Konstantin Osipov

Hi! Thanks for the comment!

Please, keep the mailing list in the CC/To next time.

> -------- Forwarded Message --------
> Subject: Re: [PATCH 1/3] tuple: make update operation tokens consumable
> Date: Tue, 24 Dec 2019 08:39:25 +0300
> From: Konstantin Osipov <kostja.osipov@gmail.com>
> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
> 
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/12/24 08:34]:
>> +/**
>> + * Make sure @a op contains a valid field number to where the
>> + * operation should be applied next. Field number may not be
>> + * known if the array's parent didn't propagate the operation's
>> + * lexer. In fact, the parent fills fieldno only in some rare
>> + * cases like branching. Generally, an array should care about
>> + * fieldno by itself.
>> + */
>> +static inline int
>> +xrow_update_op_prepare_num_token(struct xrow_update_op *op)
>> +{
>> +	/*
>> +	 * Token type END is a special value meaning that the
>> +	 * current token needs to be parsed.
>> +	 */
>> +	if (op->token_type == JSON_TOKEN_END &&
>> +	    xrow_update_op_consume_token(op) != 0)
>> +			return -1;
>> +	if (op->token_type != JSON_TOKEN_NUM) {
>> +		return xrow_update_err(op, "can't update an array by not a "\
>> +				       "number index");
> 
> non-numeric index
> 
> 
> -- 
> Konstantin Osipov, Moscow, Russia

Ok, fixed:

diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
index 6a7ce09ff..2eb98e8e9 100644
--- a/src/box/xrow_update_array.c
+++ b/src/box/xrow_update_array.c
@@ -51,8 +51,8 @@ xrow_update_op_prepare_num_token(struct xrow_update_op *op)
 	    xrow_update_op_consume_token(op) != 0)
 			return -1;
 	if (op->token_type != JSON_TOKEN_NUM) {
-		return xrow_update_err(op, "can't update an array by not a "\
-				       "number index");
+		return xrow_update_err(op, "can't update an array by a "\
+				       "non-numeric index");
 	}
 	return 0;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable Vladislav Shpilevoy
  2019-12-24 16:15   ` Vladislav Shpilevoy
@ 2019-12-26 12:07   ` Sergey Ostanevich
  2019-12-27 13:00     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 13+ messages in thread
From: Sergey Ostanevich @ 2019-12-26 12:07 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi!

Thanks for the patch! 
Just one comment below, otherwise LGTM.

Sergos

On 23 Dec 23:41, Vladislav Shpilevoy wrote:
> There is a case: [1][2][3][4] = 100. It is not a problem when it
> is a single operation, not intersecting with anything. It is an
> isolated update then, and works ok. But the next patch allows
> several update operations to have the same prefix, and the path
> [1][2][3][4] can become a tree of updated arrays. For example, a
> trivial tree like this:
> 
>     root: [ [1] ]
>              |
>              [ [1] [2] ]
>                     |
>                     [ [1] [2] [3] ]
>                                |
>                                [ [1] [2] [3] [4] ]
>                                              =100
> 
> When the update is applied to root, the JSON path [1][2][3][4]
> is decoded one part by one. And the operation goes down the tree
> until reaches the leaf, where [4] = 100 is applied. Each time when
> the update goes one level down, somebody should update
> xrow_update_op.field_no so as on the first level it would be 1,
> then 2, 3, 4.
> 
> Does it mean that each level of the update [1][2][3][4] should
> prepare field_no for the next child? No, because then they would
> need to check type of the child if it is an array or map, or
> whatever expects a valid field_no/key in xrow_update_op, and
> ensure that map-child gets a key, array-child gets a field_no.
> That would complicate the code to a totally unreadable
> state, and would break encapsulation between
> xrow_update_array/map/bar... . Each array update operation would
> check a child for all existing types to ensure that the next token
> matches it. The same would happen to map updates.
> 
> This patch goes another way - let each level of update check if
> its field_no/key is already prepared by the caller. And if not,
> extract a next token from the operation path. So the map update
> will ensure that it has a valid key, an array update will ensure
> that it has a valid field no.
> 
> Part of #1261
> ---
>  src/box/xrow_update_array.c | 47 +++++++++++++++++++++++++++++++++++--
>  src/box/xrow_update_field.c | 22 +++++++++++++++++
>  src/box/xrow_update_field.h | 25 ++++++++++++++++++--
>  3 files changed, 90 insertions(+), 4 deletions(-)
> 
> diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
> index 57427e39c..6a7ce09ff 100644
> --- a/src/box/xrow_update_array.c
> +++ b/src/box/xrow_update_array.c
> @@ -32,6 +32,31 @@
>  #include "msgpuck.h"
>  #include "fiber.h"
>  
> +/**
> + * Make sure @a op contains a valid field number to where the
> + * operation should be applied next. Field number may not be
> + * known if the array's parent didn't propagate the operation's
> + * lexer. In fact, the parent fills fieldno only in some rare
> + * cases like branching. Generally, an array should care about
> + * fieldno by itself.
> + */
> +static inline int
> +xrow_update_op_prepare_num_token(struct xrow_update_op *op)
> +{
> +	/*
> +	 * Token type END is a special value meaning that the
> +	 * current token needs to be parsed.
> +	 */

I really don't like reusing entities introduced for another purpose.
Anyone reading this code will find it hard to understand why, out of
nowhere, the lexer is forced to come to an end in
xrow_update_op_do_array_insert(), for example. More than that - if the
lexer is in the 'reached the end' state, it is forced to parse.

If we introduce some more functionality for JSON later - I expect we
will face even more logic to hang on this 'special value'. Then
contexts of these functionalities may intersect... oh my...

Can we introduce a new type with some self-explanatory name instead and
leave the JSON_TOKEN_END on its own? 

> +	if (op->token_type == JSON_TOKEN_END &&
> +	    xrow_update_op_consume_token(op) != 0)
> +			return -1;
> +	if (op->token_type != JSON_TOKEN_NUM) {
> +		return xrow_update_err(op, "can't update an array by not a "\
> +				       "number index");
> +	}
> +	return 0;
> +}
> +
>  /**
>   * Make field index non-negative and check for the field
>   * existence.
> @@ -39,6 +64,7 @@
>  static inline int
>  xrow_update_op_adjust_field_no(struct xrow_update_op *op, int32_t field_count)
>  {
> +	assert(op->token_type == JSON_TOKEN_NUM);
>  	if (op->field_no >= 0) {
>  		if (op->field_no < field_count)
>  			return 0;
> @@ -221,10 +247,14 @@ xrow_update_op_do_array_insert(struct xrow_update_op *op,
>  {
>  	assert(field->type == XUPDATE_ARRAY);
>  	struct xrow_update_array_item *item;
> +	if (xrow_update_op_prepare_num_token(op) != 0)
> +		return -1;
> +
>  	if (!xrow_update_op_is_term(op)) {
>  		item = xrow_update_array_extract_item(field, op);
>  		if (item == NULL)
>  			return -1;
> +		op->token_type = JSON_TOKEN_END;
>  		return xrow_update_op_do_field_insert(op, &item->field);
>  	}
>  
> @@ -248,6 +278,9 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
>  {
>  	assert(field->type == XUPDATE_ARRAY);
>  	struct xrow_update_rope *rope = field->array.rope;
> +	if (xrow_update_op_prepare_num_token(op) != 0)
> +		return -1;
> +
>  	/* Interpret '=' for n + 1 field as insert. */
>  	if (op->field_no == (int32_t) xrow_update_rope_size(rope))
>  		return xrow_update_op_do_array_insert(op, field);
> @@ -256,8 +289,10 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
>  		xrow_update_array_extract_item(field, op);
>  	if (item == NULL)
>  		return -1;
> -	if (!xrow_update_op_is_term(op))
> +	if (!xrow_update_op_is_term(op)) {
> +		op->token_type = JSON_TOKEN_END;
>  		return xrow_update_op_do_field_set(op, &item->field);
> +	}
>  	op->new_field_len = op->arg.set.length;
>  	/*
>  	 * Ignore the previous op, if any. It is not correct,
> @@ -275,11 +310,15 @@ xrow_update_op_do_array_delete(struct xrow_update_op *op,
>  			       struct xrow_update_field *field)
>  {
>  	assert(field->type == XUPDATE_ARRAY);
> +	if (xrow_update_op_prepare_num_token(op) != 0)
> +		return -1;
> +
>  	if (!xrow_update_op_is_term(op)) {
>  		struct xrow_update_array_item *item =
>  			xrow_update_array_extract_item(field, op);
>  		if (item == NULL)
>  			return -1;
> +		op->token_type = JSON_TOKEN_END;
>  		return xrow_update_op_do_field_delete(op, &item->field);
>  	}
>  
> @@ -301,12 +340,16 @@ int										\
>  xrow_update_op_do_array_##op_type(struct xrow_update_op *op,			\
>  				  struct xrow_update_field *field)		\
>  {										\
> +	if (xrow_update_op_prepare_num_token(op) != 0)				\
> +		return -1;							\
>  	struct xrow_update_array_item *item =					\
>  		xrow_update_array_extract_item(field, op);			\
>  	if (item == NULL)							\
>  		return -1;							\
> -	if (!xrow_update_op_is_term(op))					\
> +	if (!xrow_update_op_is_term(op)) {					\
> +		op->token_type = JSON_TOKEN_END;				\
>  		return xrow_update_op_do_field_##op_type(op, &item->field);	\
> +	}									\
>  	if (item->field.type != XUPDATE_NOP)					\
>  		return xrow_update_err_double(op);				\
>  	if (xrow_update_op_do_##op_type(op, item->field.data) != 0)		\
> diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
> index de865a21d..96fcaf747 100644
> --- a/src/box/xrow_update_field.c
> +++ b/src/box/xrow_update_field.c
> @@ -611,6 +611,22 @@ xrow_update_op_by(char opcode)
>  	}
>  }
>  
> +int
> +xrow_update_op_consume_token(struct xrow_update_op *op)
> +{
> +	struct json_token token;
> +	int rc = json_lexer_next_token(&op->lexer, &token);
> +	if (rc != 0)
> +		return xrow_update_err_bad_json(op, rc);
> +	if (token.type == JSON_TOKEN_END)
> +		return xrow_update_err_no_such_field(op);
> +	op->token_type = token.type;
> +	op->key = token.str;
> +	op->key_len = token.len;
> +	op->field_no = token.num;
> +	return 0;
> +}
> +
>  int
>  xrow_update_op_decode(struct xrow_update_op *op, int index_base,
>  		      struct tuple_dictionary *dict, const char **expr)
> @@ -639,6 +655,12 @@ xrow_update_op_decode(struct xrow_update_op *op, int index_base,
>  		diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
>  		return -1;
>  	}
> +	/*
> +	 * First token is always num. Even if a user specified a
> +	 * field name it is converted to num by the tuple
> +	 * dictionary.
> +	 */
> +	op->token_type = JSON_TOKEN_NUM;
>  	int32_t field_no = 0;
>  	switch(mp_typeof(**expr)) {
>  	case MP_INT:
> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
> index bda9222cc..fbaf45c5d 100644
> --- a/src/box/xrow_update_field.h
> +++ b/src/box/xrow_update_field.h
> @@ -179,8 +179,18 @@ struct xrow_update_op {
>  	const struct xrow_update_op_meta *meta;
>  	/** Operation arguments. */
>  	union xrow_update_arg arg;
> -	/** First level field no. */
> -	int32_t field_no;
> +	/**
> +	 * Current level token. END means that it is invalid and
> +	 * a next token should be extracted from the lexer.
> +	 */
> +	enum json_token_type token_type;
> +	union {
> +		struct {
> +			const char *key;
> +			uint32_t key_len;
> +		};
> +		int32_t field_no;
> +	};
>  	/** Size of a new field after it is updated. */
>  	uint32_t new_field_len;
>  	/** Opcode symbol: = + - / ... */
> @@ -193,6 +203,17 @@ struct xrow_update_op {
>  	struct json_lexer lexer;
>  };
>  
> +/**
> + * Extract a next token from the operation path lexer. The result
> + * is used to decide to which child of a current map/array the
> + * operation should be forwarded. It is not just a synonym to
> + * json_lexer_next_token, because fills some fields of @a op,
> + * and should be used only to chose a next child inside a current
> + * map/array.
> + */
> +int
> +xrow_update_op_consume_token(struct xrow_update_op *op);
> +
>  /**
>   * Decode an update operation from MessagePack.
>   * @param[out] op Update operation.
> -- 
> 2.21.0 (Apple Git-122.2)
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable
  2019-12-26 12:07   ` Sergey Ostanevich
@ 2019-12-27 13:00     ` Vladislav Shpilevoy
  2019-12-27 14:59       ` Sergey Ostanevich
  0 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-27 13:00 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Hi! Thanks for the review!

On 26/12/2019 13:07, Sergey Ostanevich wrote:
> Hi!
> 
> Thanks for the patch! 
> Just one comment below, otherwise LGTM.
> 
> Sergos
> 
> On 23 Dec 23:41, Vladislav Shpilevoy wrote:
>> There is a case: [1][2][3][4] = 100. It is not a problem when it
>> is a single operation, not intersecting with anything. It is an
>> isolated update then, and works ok. But the next patch allows
>> several update operations to have the same prefix, and the path
>> [1][2][3][4] can become a tree of updated arrays. For example, a
>> trivial tree like this:
>>
>>     root: [ [1] ]
>>              |
>>              [ [1] [2] ]
>>                     |
>>                     [ [1] [2] [3] ]
>>                                |
>>                                [ [1] [2] [3] [4] ]
>>                                              =100
>>
>> When the update is applied to root, the JSON path [1][2][3][4]
>> is decoded one part at a time, and the operation goes down the tree
>> until it reaches the leaf, where [4] = 100 is applied. Each time
>> the update goes one level down, somebody should update
>> xrow_update_op.field_no so that on the first level it is 1,
>> then 2, 3, 4.
>>
>> Does it mean that each level of the update [1][2][3][4] should
>> prepare field_no for the next child? No, because then they would
>> need to check type of the child if it is an array or map, or
>> whatever expects a valid field_no/key in xrow_update_op, and
>> ensure that map-child gets a key, array-child gets a field_no.
>> That would complicate the code to a totally unreadable
>> state, and would break encapsulation between
>> xrow_update_array/map/bar... . Each array update operation would
>> check a child for all existing types to ensure that the next token
>> matches it. The same would happen to map updates.
>>
>> This patch goes another way - let each level of update check if
>> its field_no/key is already prepared by the caller. And if not,
>> extract the next token from the operation path. So the map update
>> will ensure that it has a valid key, and an array update will
>> ensure that it has a valid field number.
>>
>> Part of #1261
>> ---
>>  src/box/xrow_update_array.c | 47 +++++++++++++++++++++++++++++++++++--
>>  src/box/xrow_update_field.c | 22 +++++++++++++++++
>>  src/box/xrow_update_field.h | 25 ++++++++++++++++++--
>>  3 files changed, 90 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
>> index 57427e39c..6a7ce09ff 100644
>> --- a/src/box/xrow_update_array.c
>> +++ b/src/box/xrow_update_array.c
>> @@ -32,6 +32,31 @@
>>  #include "msgpuck.h"
>>  #include "fiber.h"
>>  
>> +/**
>> + * Make sure @a op contains a valid field number for where the
>> + * operation should be applied next. The field number may not be
>> + * known if the array's parent didn't propagate the operation's
>> + * lexer. In fact, the parent fills fieldno only in some rare
>> + * cases like branching. Generally, an array should care about
>> + * fieldno by itself.
>> + */
>> +static inline int
>> +xrow_update_op_prepare_num_token(struct xrow_update_op *op)
>> +{
>> +	/*
>> +	 * Token type END is a special value meaning that the
>> +	 * current token needs to be parsed.
>> +	 */
> 
> I really don't like the reuse of entities introduced for another
> purpose. If one reads this code, it will be hard to understand why,
> out of nowhere, the lexer is forced to come to an end at
> xrow_update_op_do_array_insert(), for example. More than that: if the
> lexer is in the 'reached the end' state, it is forced to parse.
> 
> If we introduce more JSON functionality later, I expect we will face
> even more logic hung on this 'special value'. Then the contexts of
> these functionalities may intersect... oh my...
> 
> Can we introduce a new type with some self-explanatory name instead and
> leave the JSON_TOKEN_END on its own? 

Agree. But I don't know how to justify introducing a new token type
to the json library. I was thinking about something like
JSON_TOKEN_INVALID, but 1) the json library reports errors via return
values, not via token types; 2) an 'invalid' token implies an error,
not an already parsed token, which is what I need in my patch.

A value like JSON_TOKEN_CONSUMED/USED looks too specific for my case
to be added to the core lib.

So I added a flag to xrow_update_op. See diff below, and the whole
patch in the end.
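To make the idea concrete, here is a minimal standalone sketch of the consumed-token pattern (hypothetical names, not the actual xrow_update structures): each tree level checks a consumed flag and lazily pulls the next path token, instead of the parent preparing field_no for the child.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the consumable-token idea. */
enum token_type { TOKEN_NUM, TOKEN_END };

struct op {
	const char *path;           /* remaining path, e.g. "[1][2]" */
	enum token_type token_type; /* type of the current token */
	int field_no;               /* current array index */
	bool is_token_consumed;     /* true -> fetch the next token */
};

/* Parse the next "[N]" token from the path. Returns 0 on success. */
static int
op_next_token(struct op *op)
{
	if (*op->path != '[')
		return -1;
	int num = 0;
	const char *p = op->path + 1;
	while (*p >= '0' && *p <= '9')
		num = num * 10 + (*p++ - '0');
	if (*p != ']')
		return -1;
	op->path = p + 1;
	op->token_type = TOKEN_NUM;
	op->field_no = num;
	op->is_token_consumed = false;
	return 0;
}

/*
 * What each array level does before touching op->field_no:
 * parse lazily, and only when the caller marked the token consumed.
 */
static int
op_prepare_num_token(struct op *op)
{
	if (op->is_token_consumed && op_next_token(op) != 0)
		return -1;
	return op->token_type == TOKEN_NUM ? 0 : -1;
}
```

A level that forwards the operation to a child sets `is_token_consumed = true`, so the child parses exactly one more token; a level whose token was pre-filled by the parent (the branching case) parses nothing.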

================================================================================

diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
index 2eb98e8e9..e6b9ed89f 100644
--- a/src/box/xrow_update_array.c
+++ b/src/box/xrow_update_array.c
@@ -43,13 +43,8 @@
 static inline int
 xrow_update_op_prepare_num_token(struct xrow_update_op *op)
 {
-	/*
-	 * Token type END is a special value meaning that the
-	 * current token needs to be parsed.
-	 */
-	if (op->token_type == JSON_TOKEN_END &&
-	    xrow_update_op_consume_token(op) != 0)
-			return -1;
+	if (op->is_token_consumed && xrow_update_op_next_token(op) != 0)
+		return -1;
 	if (op->token_type != JSON_TOKEN_NUM) {
 		return xrow_update_err(op, "can't update an array by a "\
 				       "non-numeric index");
@@ -64,7 +59,7 @@ xrow_update_op_prepare_num_token(struct xrow_update_op *op)
 static inline int
 xrow_update_op_adjust_field_no(struct xrow_update_op *op, int32_t field_count)
 {
-	assert(op->token_type == JSON_TOKEN_NUM);
+	assert(op->token_type == JSON_TOKEN_NUM && !op->is_token_consumed);
 	if (op->field_no >= 0) {
 		if (op->field_no < field_count)
 			return 0;
@@ -254,7 +249,7 @@ xrow_update_op_do_array_insert(struct xrow_update_op *op,
 		item = xrow_update_array_extract_item(field, op);
 		if (item == NULL)
 			return -1;
-		op->token_type = JSON_TOKEN_END;
+		op->is_token_consumed = true;
 		return xrow_update_op_do_field_insert(op, &item->field);
 	}
 
@@ -290,7 +285,7 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
 	if (item == NULL)
 		return -1;
 	if (!xrow_update_op_is_term(op)) {
-		op->token_type = JSON_TOKEN_END;
+		op->is_token_consumed = true;
 		return xrow_update_op_do_field_set(op, &item->field);
 	}
 	op->new_field_len = op->arg.set.length;
@@ -318,7 +313,7 @@ xrow_update_op_do_array_delete(struct xrow_update_op *op,
 			xrow_update_array_extract_item(field, op);
 		if (item == NULL)
 			return -1;
-		op->token_type = JSON_TOKEN_END;
+		op->is_token_consumed = true;
 		return xrow_update_op_do_field_delete(op, &item->field);
 	}
 
@@ -347,7 +342,7 @@ xrow_update_op_do_array_##op_type(struct xrow_update_op *op,			\
 	if (item == NULL)							\
 		return -1;							\
 	if (!xrow_update_op_is_term(op)) {					\
-		op->token_type = JSON_TOKEN_END;				\
+		op->is_token_consumed = true;					\
 		return xrow_update_op_do_field_##op_type(op, &item->field);	\
 	}									\
 	if (item->field.type != XUPDATE_NOP)					\
diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
index 96fcaf747..460672b4c 100644
--- a/src/box/xrow_update_field.c
+++ b/src/box/xrow_update_field.c
@@ -612,7 +612,7 @@ xrow_update_op_by(char opcode)
 }
 
 int
-xrow_update_op_consume_token(struct xrow_update_op *op)
+xrow_update_op_next_token(struct xrow_update_op *op)
 {
 	struct json_token token;
 	int rc = json_lexer_next_token(&op->lexer, &token);
@@ -620,6 +620,7 @@ xrow_update_op_consume_token(struct xrow_update_op *op)
 		return xrow_update_err_bad_json(op, rc);
 	if (token.type == JSON_TOKEN_END)
 		return xrow_update_err_no_such_field(op);
+	op->is_token_consumed = false;
 	op->token_type = token.type;
 	op->key = token.str;
 	op->key_len = token.len;
@@ -661,6 +662,7 @@ xrow_update_op_decode(struct xrow_update_op *op, int index_base,
 	 * dictionary.
 	 */
 	op->token_type = JSON_TOKEN_NUM;
+	op->is_token_consumed = false;
 	int32_t field_no = 0;
 	switch(mp_typeof(**expr)) {
 	case MP_INT:
diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
index fbaf45c5d..0b0f608fe 100644
--- a/src/box/xrow_update_field.h
+++ b/src/box/xrow_update_field.h
@@ -179,11 +179,16 @@ struct xrow_update_op {
 	const struct xrow_update_op_meta *meta;
 	/** Operation arguments. */
 	union xrow_update_arg arg;
+	/** Current level token. */
+	enum json_token_type token_type;
 	/**
-	 * Current level token. END means that it is invalid and
-	 * a next token should be extracted from the lexer.
+	 * The flag says whether the token is already consumed by
+	 * the update operation while it is forwarded down the
+	 * update tree. When the flag is true, the next node of the
+	 * update tree will need to fetch the next token from the
+	 * lexer.
 	 */
-	enum json_token_type token_type;
+	bool is_token_consumed;
 	union {
 		struct {
 			const char *key;
@@ -212,7 +217,7 @@ struct xrow_update_op {
  * map/array.
  */
 int
-xrow_update_op_consume_token(struct xrow_update_op *op);
+xrow_update_op_next_token(struct xrow_update_op *op);
 
 /**
  * Decode an update operation from MessagePack.

================================================================================

The whole patch:

================================================================================

    tuple: make update operation tokens consumable
    
    There is a case: [1][2][3][4] = 100. It is not a problem when it
    is a single operation, not intersecting with anything. It is an
    isolated update then, and works ok. But the next patch allows
    several update operations to have the same prefix, and the path
    [1][2][3][4] can become a tree of updated arrays. For example, a
    trivial tree like this:
    
        root: [ [1] ]
                 |
                 [ [1] [2] ]
                        |
                        [ [1] [2] [3] ]
                                   |
                                   [ [1] [2] [3] [4] ]
                                                 =100
    
    When the update is applied to root, the JSON path [1][2][3][4]
    is decoded one part at a time, and the operation goes down the
    tree until it reaches the leaf, where [4] = 100 is applied. Each
    time the update goes one level down, somebody should update
    xrow_update_op.field_no so that on the first level it is 1, then
    2, 3, and 4.
    
    Does it mean that each level of the update [1][2][3][4] should
    prepare field_no for the next child? No, because then they would
    need to check the type of the child, whether it is an array or map, or
    whatever expects a valid field_no/key in xrow_update_op, and
    ensure that map-child gets a key, array-child gets a field_no.
    That would complicate the code to a totally unreadable
    state, and would break encapsulation between
    xrow_update_array/map/bar... . Each array update operation would
    check a child for all existing types to ensure that the next token
    matches it. The same would happen to map updates.
    
    This patch goes another way - let each level of update check if
    its field_no/key is already prepared by the caller. And if not,
    extract the next token from the operation path. So the map update
    will ensure that it has a valid key, and an array update will
    ensure that it has a valid field number.
    
    Part of #1261

diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
index 57427e39c..e6b9ed89f 100644
--- a/src/box/xrow_update_array.c
+++ b/src/box/xrow_update_array.c
@@ -32,6 +32,26 @@
 #include "msgpuck.h"
 #include "fiber.h"
 
+/**
+ * Make sure @a op contains a valid field number for where the
+ * operation should be applied next. The field number may not be
+ * known if the array's parent didn't propagate the operation's
+ * lexer. In fact, the parent fills fieldno only in some rare
+ * cases like branching. Generally, an array should care about
+ * fieldno by itself.
+ */
+static inline int
+xrow_update_op_prepare_num_token(struct xrow_update_op *op)
+{
+	if (op->is_token_consumed && xrow_update_op_next_token(op) != 0)
+		return -1;
+	if (op->token_type != JSON_TOKEN_NUM) {
+		return xrow_update_err(op, "can't update an array by a "\
+				       "non-numeric index");
+	}
+	return 0;
+}
+
 /**
  * Make field index non-negative and check for the field
  * existence.
@@ -39,6 +59,7 @@
 static inline int
 xrow_update_op_adjust_field_no(struct xrow_update_op *op, int32_t field_count)
 {
+	assert(op->token_type == JSON_TOKEN_NUM && !op->is_token_consumed);
 	if (op->field_no >= 0) {
 		if (op->field_no < field_count)
 			return 0;
@@ -221,10 +242,14 @@ xrow_update_op_do_array_insert(struct xrow_update_op *op,
 {
 	assert(field->type == XUPDATE_ARRAY);
 	struct xrow_update_array_item *item;
+	if (xrow_update_op_prepare_num_token(op) != 0)
+		return -1;
+
 	if (!xrow_update_op_is_term(op)) {
 		item = xrow_update_array_extract_item(field, op);
 		if (item == NULL)
 			return -1;
+		op->is_token_consumed = true;
 		return xrow_update_op_do_field_insert(op, &item->field);
 	}
 
@@ -248,6 +273,9 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
 {
 	assert(field->type == XUPDATE_ARRAY);
 	struct xrow_update_rope *rope = field->array.rope;
+	if (xrow_update_op_prepare_num_token(op) != 0)
+		return -1;
+
 	/* Interpret '=' for n + 1 field as insert. */
 	if (op->field_no == (int32_t) xrow_update_rope_size(rope))
 		return xrow_update_op_do_array_insert(op, field);
@@ -256,8 +284,10 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
 		xrow_update_array_extract_item(field, op);
 	if (item == NULL)
 		return -1;
-	if (!xrow_update_op_is_term(op))
+	if (!xrow_update_op_is_term(op)) {
+		op->is_token_consumed = true;
 		return xrow_update_op_do_field_set(op, &item->field);
+	}
 	op->new_field_len = op->arg.set.length;
 	/*
 	 * Ignore the previous op, if any. It is not correct,
@@ -275,11 +305,15 @@ xrow_update_op_do_array_delete(struct xrow_update_op *op,
 			       struct xrow_update_field *field)
 {
 	assert(field->type == XUPDATE_ARRAY);
+	if (xrow_update_op_prepare_num_token(op) != 0)
+		return -1;
+
 	if (!xrow_update_op_is_term(op)) {
 		struct xrow_update_array_item *item =
 			xrow_update_array_extract_item(field, op);
 		if (item == NULL)
 			return -1;
+		op->is_token_consumed = true;
 		return xrow_update_op_do_field_delete(op, &item->field);
 	}
 
@@ -301,12 +335,16 @@ int										\
 xrow_update_op_do_array_##op_type(struct xrow_update_op *op,			\
 				  struct xrow_update_field *field)		\
 {										\
+	if (xrow_update_op_prepare_num_token(op) != 0)				\
+		return -1;							\
 	struct xrow_update_array_item *item =					\
 		xrow_update_array_extract_item(field, op);			\
 	if (item == NULL)							\
 		return -1;							\
-	if (!xrow_update_op_is_term(op))					\
+	if (!xrow_update_op_is_term(op)) {					\
+		op->is_token_consumed = true;					\
 		return xrow_update_op_do_field_##op_type(op, &item->field);	\
+	}									\
 	if (item->field.type != XUPDATE_NOP)					\
 		return xrow_update_err_double(op);				\
 	if (xrow_update_op_do_##op_type(op, item->field.data) != 0)		\
diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
index de865a21d..460672b4c 100644
--- a/src/box/xrow_update_field.c
+++ b/src/box/xrow_update_field.c
@@ -611,6 +611,23 @@ xrow_update_op_by(char opcode)
 	}
 }
 
+int
+xrow_update_op_next_token(struct xrow_update_op *op)
+{
+	struct json_token token;
+	int rc = json_lexer_next_token(&op->lexer, &token);
+	if (rc != 0)
+		return xrow_update_err_bad_json(op, rc);
+	if (token.type == JSON_TOKEN_END)
+		return xrow_update_err_no_such_field(op);
+	op->is_token_consumed = false;
+	op->token_type = token.type;
+	op->key = token.str;
+	op->key_len = token.len;
+	op->field_no = token.num;
+	return 0;
+}
+
 int
 xrow_update_op_decode(struct xrow_update_op *op, int index_base,
 		      struct tuple_dictionary *dict, const char **expr)
@@ -639,6 +656,13 @@ xrow_update_op_decode(struct xrow_update_op *op, int index_base,
 		diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
 		return -1;
 	}
+	/*
+	 * First token is always num. Even if a user specified a
+	 * field name it is converted to num by the tuple
+	 * dictionary.
+	 */
+	op->token_type = JSON_TOKEN_NUM;
+	op->is_token_consumed = false;
 	int32_t field_no = 0;
 	switch(mp_typeof(**expr)) {
 	case MP_INT:
diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
index bda9222cc..0b0f608fe 100644
--- a/src/box/xrow_update_field.h
+++ b/src/box/xrow_update_field.h
@@ -179,8 +179,23 @@ struct xrow_update_op {
 	const struct xrow_update_op_meta *meta;
 	/** Operation arguments. */
 	union xrow_update_arg arg;
-	/** First level field no. */
-	int32_t field_no;
+	/** Current level token. */
+	enum json_token_type token_type;
+	/**
+	 * The flag says whether the token is already consumed by
+	 * the update operation while it is forwarded down the
+	 * update tree. When the flag is true, the next node of the
+	 * update tree will need to fetch the next token from the
+	 * lexer.
+	 */
+	bool is_token_consumed;
+	union {
+		struct {
+			const char *key;
+			uint32_t key_len;
+		};
+		int32_t field_no;
+	};
 	/** Size of a new field after it is updated. */
 	uint32_t new_field_len;
 	/** Opcode symbol: = + - / ... */
@@ -193,6 +208,17 @@ struct xrow_update_op {
 	struct json_lexer lexer;
 };
 
+/**
+ * Extract a next token from the operation path lexer. The result
+ * is used to decide to which child of a current map/array the
+ * operation should be forwarded. It is not just a synonym for
+ * json_lexer_next_token, because it fills some fields of @a op,
+ * and should be used only to choose a next child inside a current
+ * map/array.
+ */
+int
+xrow_update_op_next_token(struct xrow_update_op *op);
+
 /**
  * Decode an update operation from MessagePack.
  * @param[out] op Update operation.

================================================================================
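For context before the 2/3 review below: when two operation paths share a prefix, the split point is the first differing array index. A toy helper (hypothetical, not part of the patch) finds that point:

```c
#include <assert.h>

/*
 * Toy illustration: paths are flattened to arrays of numeric
 * indices. The returned length is the common prefix that would
 * become an XUPDATE_ROUTE; the operations branch right after it.
 */
static int
common_prefix_len(const int *a, int a_len, const int *b, int b_len)
{
	int n = a_len < b_len ? a_len : b_len;
	for (int i = 0; i < n; ++i) {
		if (a[i] != b[i])
			return i;
	}
	return n;
}
```

With [1][2][3].a.b.c and [1][2][4].e.f.g the common prefix length is 2, so [1][2] becomes the route and the subtree branches at indices 3 vs 4.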

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays Vladislav Shpilevoy
@ 2019-12-27 14:13   ` Sergey Ostanevich
  2019-12-27 15:52     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 13+ messages in thread
From: Sergey Ostanevich @ 2019-12-27 14:13 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi!

Thanks for the patch, just a couple of comments below, LGTM

Sergos

On 23 Dec 23:41, Vladislav Shpilevoy wrote:
> Before the patch only isolated JSON updates were supported as the
> simplest and fastest to implement. This patch allows update
> operations with paths having the same prefix. But the difference
> between the paths must start at an array index.
> 
> For example, this is allowed:
> 
>     [1][2][3].a.b.c = 20
>     [1][2][4].e.f.g = 30
> 
>     First difference is [3] vs [4] - intersection by an array, ok.
> 
> This is not allowed yet:
> 
>     [1][2][3].a.b.c = 20
>     [1][2][3].a.e.f = 30
> 
>     First difference is 'b' vs 'e' - intersection by a map,
>     not ok.
> 
> For that a new update tree node type is added - XUPDATE_ROUTE.
> When several update operations have the same prefix, this prefix
> becomes an XUPDATE_ROUTE tree field. It stores the prefix and a
> subtree with these operations.
> 
> Bar and route update nodes can branch and produce more bars and
> routes when new operations come.
> 
> Part of #1261
> ---
>  src/box/CMakeLists.txt      |   1 +
>  src/box/xrow_update_array.c |  50 ++++++
>  src/box/xrow_update_bar.c   |  14 +-
>  src/box/xrow_update_field.c |   4 +
>  src/box/xrow_update_field.h |  99 +++++++++++
>  src/box/xrow_update_route.c | 346 ++++++++++++++++++++++++++++++++++++
>  test/box/update.result      | 177 ++++++++++++++++++
>  test/box/update.test.lua    |  91 ++++++++++
>  8 files changed, 772 insertions(+), 10 deletions(-)
>  create mode 100644 src/box/xrow_update_route.c
> 
> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
> index fc9d1a3e8..f99efc503 100644
> --- a/src/box/CMakeLists.txt
> +++ b/src/box/CMakeLists.txt
> @@ -44,6 +44,7 @@ add_library(tuple STATIC
>      xrow_update_field.c
>      xrow_update_array.c
>      xrow_update_bar.c
> +    xrow_update_route.c
>      tuple_compare.cc
>      tuple_extract_key.cc
>      tuple_hash.cc
> diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
> index 6a7ce09ff..e49dad5c1 100644
> --- a/src/box/xrow_update_array.c
> +++ b/src/box/xrow_update_array.c
> @@ -195,6 +195,56 @@ xrow_update_array_create(struct xrow_update_field *field, const char *header,
>  	return xrow_update_rope_append(field->array.rope, item, field_count);
>  }
>  
> +int
> +xrow_update_array_create_with_child(struct xrow_update_field *field,
> +				    const char *header,
> +				    const struct xrow_update_field *child,
> +				    int32_t field_no)
> +{
> +	const char *data = header;
> +	uint32_t field_count = mp_decode_array(&data);
> +	const char *first_field = data;
> +	const char *first_field_end = first_field;
> +	mp_next(&first_field_end);
> +	struct region *region = &fiber()->gc;
> +	struct xrow_update_rope *rope = xrow_update_rope_new(region);
> +	if (rope == NULL)
> +		return -1;
> +	struct xrow_update_array_item *item = (struct xrow_update_array_item *)
> +		xrow_update_alloc(region, sizeof(*item));
> +	if (item == NULL)
> +		return -1;
> +	const char *end = first_field_end;
> +	if (field_no > 0) {
> +		for (int32_t i = 1; i < field_no; ++i)
> +			mp_next(&end);
> +		xrow_update_array_item_create(item, XUPDATE_NOP, first_field,
> +					      first_field_end - first_field,
> +					      end - first_field_end);
> +		if (xrow_update_rope_append(rope, item, field_no) != 0)
> +			return -1;
> +		item = (struct xrow_update_array_item *)
> +			xrow_update_alloc(region, sizeof(*item));
> +		if (item == NULL)
> +			return -1;
> +		first_field = end;
> +		first_field_end = first_field;
> +		mp_next(&first_field_end);
> +		end = first_field_end;
> +	}
> +	for (uint32_t i = field_no + 1; i < field_count; ++i)
> +		mp_next(&end);
> +	item->field = *child;
> +	xrow_update_array_item_create(item, child->type, first_field,
> +				      first_field_end - first_field,
> +				      end - first_field_end);
> +	field->type = XUPDATE_ARRAY;
> +	field->data = header;
> +	field->size = end - header;
> +	field->array.rope = rope;
> +	return xrow_update_rope_append(rope, item, field_count - field_no);
> +}
> +
>  uint32_t
>  xrow_update_array_sizeof(struct xrow_update_field *field)
>  {
> diff --git a/src/box/xrow_update_bar.c b/src/box/xrow_update_bar.c
> index 7285c7e2b..1c2d2ef99 100644
> --- a/src/box/xrow_update_bar.c
> +++ b/src/box/xrow_update_bar.c
> @@ -307,17 +307,11 @@ int										\
>  xrow_update_op_do_bar_##op_type(struct xrow_update_op *op,			\
>  				struct xrow_update_field *field)		\
>  {										\
> -	/*									\
> -	 * The only way to update a bar is to make a second update		\
> -	 * with the same prefix as this bar. But it is not			\
> -	 * supported yet.							\
> -	 */									\
> -	(void) op;								\
> -	(void) field;								\
>  	assert(field->type == XUPDATE_BAR);					\
> -	diag_set(ClientError, ER_UNSUPPORTED, "update",				\
> -		 "intersected JSON paths");					\
> -	return -1;								\
> +	field = xrow_update_route_branch(field, op);				\
> +	if (field == NULL)							\
> +		return -1;							\
> +	return xrow_update_op_do_field_##op_type(op, field);			\
>  }
>  
>  DO_BAR_OP_GENERIC(insert)
> diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
> index 96fcaf747..3a4aa6c90 100644
> --- a/src/box/xrow_update_field.c
> +++ b/src/box/xrow_update_field.c
> @@ -114,6 +114,8 @@ xrow_update_field_sizeof(struct xrow_update_field *field)
>  		return xrow_update_array_sizeof(field);
>  	case XUPDATE_BAR:
>  		return xrow_update_bar_sizeof(field);
> +	case XUPDATE_ROUTE:
> +		return xrow_update_route_sizeof(field);
>  	default:
>  		unreachable();
>  	}
> @@ -141,6 +143,8 @@ xrow_update_field_store(struct xrow_update_field *field, char *out,
>  		return xrow_update_array_store(field, out, out_end);
>  	case XUPDATE_BAR:
>  		return xrow_update_bar_store(field, out, out_end);
> +	case XUPDATE_ROUTE:
> +		return xrow_update_route_store(field, out, out_end);
>  	default:
>  		unreachable();
>  	}
> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
> index fbaf45c5d..f07b5a649 100644
> --- a/src/box/xrow_update_field.h
> +++ b/src/box/xrow_update_field.h
> @@ -275,6 +275,23 @@ enum xrow_update_type {
>  	 * path nodes. And this is the most common case.
>  	 */
>  	XUPDATE_BAR,
> +	/**
> +	 * Field with a subtree of updates having the same prefix
> +	 * stored here explicitly. New updates with the same
> +	 * prefix just follow it without decoding of JSON nor
> +	 * MessagePack. It can be quite helpful when an update
> +	 * works with the same internal object via several
> +	 * operations like this:
> +	 *
> +	 *    [1][2].a.b.c[1] = 20
> +	 *    [1][2].a.b.c[2] = 30
> +	 *    [1][2].a.b.c[3] = true
> +	 *
> +	 * Here [1][2].a.b.c is stored only once as a route with a
> +	 * child XUPDATE_ARRAY making updates '[1] = 20',
> +	 * '[2] = 30', '[3] = true'.
> +	 */
> +	XUPDATE_ROUTE,
>  };
>  
>  /**
> @@ -355,6 +372,17 @@ struct xrow_update_field {
>  				};
>  			};
>  		} bar;
> +		/** Route update - path to an update subtree. */
> +		struct {
> +			/**
> +			 * Common prefix of all updates in the
> +			 * subtree.
> +			 */
> +			const char *path;
> +			int path_len;
> +			/** Update subtree. */
> +			struct xrow_update_field *next_hop;
> +		} route;
>  	};
>  };
>  
> @@ -447,6 +475,33 @@ xrow_update_array_create(struct xrow_update_field *field, const char *header,
>  			 const char *data, const char *data_end,
>  			 uint32_t field_count);
>  
> +/**
> + * The same as mere create, but the array is created with a first
> + * child right away. This makes it more efficient, because
> + * under the hood a rope is created not as one huge range which
> + * is then spit in parts, but as two rope nodes from the
              ^^^^
Should be     split here?

> + * beginning. In summary: -1 rope node split, -1 decoding of
> + * fields from 0 to @a field_no.
> + *
> + * The function is used during branching, where there was an
> + * existing update, but another one came with the same prefix, and
> + * a different suffix.
> + *
> + * @param[out] field Field to initialize.
> + * @param header Header of the MessagePack array.
> + * @param child A child subtree. The child is copied by value into
> + *        the created array.
> + * @param field_no Field number of @a child.
> + *
> + * @retval  0 Success.
> + * @retval -1 Error.
> + */
> +int
> +xrow_update_array_create_with_child(struct xrow_update_field *field,
> +				    const char *header,
> +				    const struct xrow_update_field *child,
> +				    int32_t field_no);
> +
>  OP_DECL_GENERIC(array)
>  
>  /* }}} xrow_update_field.array */
> @@ -463,6 +518,37 @@ OP_DECL_GENERIC(nop)
>  
>  /* }}} update_field.nop */
>  
> +/* {{{ xrow_update_field.route */
> +
> +/**
> + * Take a bar or a route @a field and split its path in the place
> + * where @a new_op should be applied. Prefix becomes a new route
> + * object, suffix becomes a child of the result route. In the
> + * result @a field stays root of its subtree, and a node of that
> + * subtree is returned, to which @a new_op should be applied.
> + *
> + * Note, this function does not apply @a new_op. It just finds
> + * where it *should be* applied and does all preparations. It is
> + * done deliberately, because otherwise do_cb() virtual function
> + * of @a new_op would have been called here, since there is no
> + * context. But a caller always knows exactly if it was insert,
> + * set, arith, etc. And a caller can and does use more specific
> + * functions like xrow_update_op_do_field_set/insert/... .
> + *
> + * @param field Field to find where to apply @a new_op.
> + * @param new_op New operation to apply.
> + *
> + * @retval not-NULL A field to which @a new_op should be applied.
> + * @retval NULL Error.
> + */
> +struct xrow_update_field *
> +xrow_update_route_branch(struct xrow_update_field *field,
> +			 struct xrow_update_op *new_op);
> +
> +OP_DECL_GENERIC(route)
> +
> +/* }}} xrow_update_field.route */
> +
>  #undef OP_DECL_GENERIC
>  
>  /* {{{ Common helpers. */
> @@ -475,6 +561,17 @@ OP_DECL_GENERIC(nop)
>   * xrow_update_op_do_field_set on its childs. Each child can be
>   * another array, a bar, a route, a map - anything. The functions
>   * below help to make such places shorter and simpler.
> + *
> + * Note that they are recursive, although it is not clearly
> + * visible. For example, if an update tree contains several array
> + * nodes on one tree branch, then update of the deepest array goes
> + * through each of these nodes and calls
> + * xrow_update_op_do_array_<opname>() on needed children. But it
> + * is ok, because operation count is usually small (<<50 in the
> + * most cases, usually <= 5), and the update tree depth is not
> + * bigger than operation count. Also, fiber stack is big enough to
> + * fit ~10k update tree depth - incredible number, even though the
> + * real limit is 4k due to limited number of operations.
>   */
>  #define OP_DECL_GENERIC(op_type)						\
>  static inline int								\
> @@ -488,6 +585,8 @@ xrow_update_op_do_field_##op_type(struct xrow_update_op *op,			\
>  		return xrow_update_op_do_nop_##op_type(op, field);		\
>  	case XUPDATE_BAR:							\
>  		return xrow_update_op_do_bar_##op_type(op, field);		\
> +	case XUPDATE_ROUTE:							\
> +		return xrow_update_op_do_route_##op_type(op, field);		\
>  	default:								\
>  		unreachable();							\
>  	}									\
> diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
> new file mode 100644
> index 000000000..f374db837
> --- /dev/null
> +++ b/src/box/xrow_update_route.c
> @@ -0,0 +1,346 @@
> +/*
> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include "xrow_update_field.h"
> +#include "fiber.h"
> +#include "tuple.h"
> +
> +/**
> + * Do the actual branch. This is the case when an existing
> + * bar/route path is different from a new operation's path in an
> + * array. The existing object needs to be split into parent-child,
> + * and the new operation becomes a second child.
> + *
> + * @param[out] next_hop A field which will be initialized as an
> + *        array, and which will be a point to apply a new
> + *        operation.
> + * @param parent MessagePack array to be taken by @a next_hop.
> + * @param child Current field from which the branch happens. It
> + *        already contains an update subtree.
> + */
> +static int
> +xrow_update_route_branch_array(struct xrow_update_field *next_hop,
> +			       const char *parent,
> +			       const struct xrow_update_field *child,
> +			       int32_t field_no)
> +{
> +	struct xrow_update_op *op = child->bar.op;
> +	/*
> +	 * There are rules when a subtree can be just copied as is

I would name these as limitations, not rules. 

> +	 * from one parent to another.
> +	 * 1) It should not be a bar update. Because if it is not
> +	 *    a bar, then it is either scalar or an array/map.
> +	 *    Scalar update can be safely moved. Array/map update
> +	 *    doesn't change their parent, and also can be moved.
> +	 *    Otherwise see (2).
> +	 * 2) It is a bar. Then it should not be a leaf. If it is
> +	 *    not a leaf, then it does not change header and other
> +	 *    fields of this particular array, and can be safely
> +	 *    moved to somewhere else.
> +	 *    Otherwise see (1).
                            ^^^
                            (3) I suppose?
> +	 * 3) Ok, it is a bar, a leaf. Then it is a bar with zero
> +	 *    path length. It could degrade to zero path length
> +	 *    during branching. In this case it should be a scalar
> +	 *    bar. The only non-scalar operations are '!' and '#'.
> +	 *
> +	 * Why '#' and '!' can't be moved? '!', for example, being
> +	 * applied to a field [1], affects all fields [2-*], and
> +	 * the array header. The same but in an even worse form
> +	 * about '#'. Such operations should be redone. They
> +	 * affect many fields and the parent.
> +	 *
> +	 * There is a tricky thing though - why not just redo
> +	 * all operations here, for the sake of code simplicity?
> +	 * It would allow removing the 'create_with_child' crutch.
> +	 * The answer is - it is not possible. If a field is
> +	 * copyable, it is not re-applicable. And vice-versa. For
> +	 * example, if it is not a leaf, then there may be many
> +	 * operations, not one. A subtree just can't be
> +	 * 're-applied'.
> +	 *
> +	 * If the operation is scalar and a leaf, then its result
> +	 * has already overridden its arguments. This is because
> +	 * scalar operations save their result into the arguments
> +	 * to save memory. Applying such an operation a second
> +	 * time would lead to very surprising results.
> +	 *
> +	 * Another reason is performance. This path should be
> +	 * quite hot, and copying a struct is certainly much
> +	 * faster than reapplying an operation via a virtual
> +	 * function call. Operations '!' and '#' are quite rare,
> +	 * so optimizing them is not a critical goal.
> +	 */
> +	if (/* (1) Not a bar. */
> +	    child->type != XUPDATE_BAR ||
> +	    /* (2) Bar, but not a leaf. */
> +	    child->bar.path_len > 0 ||
> +	    /* (3) Leaf, bar, but a scalar operation. */
> +	    (op->opcode != '!' && op->opcode != '#')) {
> +		return xrow_update_array_create_with_child(next_hop, parent,
> +							   child, field_no);
> +	}
> +	/*
> +	 * Can't move the child. The only way to branch is to
> +	 * reapply the child's operation.
> +	 */
> +	op->token_type = JSON_TOKEN_NUM;
> +	op->field_no = field_no;
> +	const char *data = parent;
> +	uint32_t field_count = mp_decode_array(&data);
> +	const char *end = data;
> +	for (uint32_t i = 0; i < field_count; ++i)
> +		mp_next(&end);
> +	if (xrow_update_array_create(next_hop, parent, data, end,
> +				     field_count) != 0)
> +		return -1;
> +	return op->meta->do_op(op, next_hop);
> +}
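
As a side note for future readers: the (1)-(3) checks above boil down to a
tiny decision function. Below is a hypothetical Python model of that
decision, not Tarantool code - the function name and the string-based
encoding of field types and opcodes are my own:

```python
# Hypothetical model of the copy-vs-reapply decision in
# xrow_update_route_branch_array(). A child subtree can be moved to a
# new parent as-is unless it is a leaf bar carrying '!' or '#', which
# affect the array header and sibling fields and must be reapplied.
def can_move_child(child_type, bar_path_len, opcode):
    if child_type != 'BAR':
        # (1) Scalar or array/map update: safe to move.
        return True
    if bar_path_len > 0:
        # (2) A bar, but not a leaf: the parent header is untouched.
        return True
    # (3) A leaf bar: only scalar operations are movable.
    return opcode not in ('!', '#')
```

The `if` condition in the real code is exactly the disjunction of these
three cases, just written over the C structs.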
> +
> +struct xrow_update_field *
> +xrow_update_route_branch(struct xrow_update_field *field,
> +			 struct xrow_update_op *new_op)
> +{
> +	assert(new_op->lexer.src != NULL);
> +	const char *old_path;
> +	int old_path_len;
> +	if (field->type == XUPDATE_BAR) {
> +		old_path = field->bar.path;
> +		old_path_len = field->bar.path_len;
> +	} else {
> +		assert(field->type == XUPDATE_ROUTE);
> +		old_path = field->route.path;
> +		old_path_len = field->route.path_len;
> +	}
> +	assert(old_path != NULL);
> +	struct json_lexer old_path_lexer;
> +	struct json_token old_token, new_token;
> +	/*
> +	 * The offset is going to be used as the length of the
> +	 * route node created as a parent of the old subtree and
> +	 * the new operation. As described in the route
> +	 * definition in struct xrow_update_field, a route is the
> +	 * common prefix of all operations of the subtree. Here
> +	 * the length of that prefix is calculated.
> +	 * In addition, the offset is used to determine whether
> +	 * the new operation differs from the current subtree
> +	 * from the very beginning. In that case the offset is 0,
> +	 * and no route is generated - the root becomes a regular
> +	 * update field (array, map), not a route. Example:
> +	 * @a field is a bar '[1].a.b = 20', @a new_op is
> +	 * '[2].c.d = 30'. Here the paths differ from the
> +	 * beginning, no common prefix = no route. An array with
> +	 * children [1].a.b and [2].c.d becomes the root.
> +	 */
> +	int saved_old_offset;
> +	json_lexer_create(&old_path_lexer, old_path, old_path_len,
> +			  TUPLE_INDEX_BASE);
> +	const char *parent = field->data;
> +	do {
> +		saved_old_offset = old_path_lexer.offset;
> +		int rc = json_lexer_next_token(&old_path_lexer, &old_token);
> +		/* Old path is already validated. */
> +		assert(rc == 0);
> +		rc = json_lexer_next_token(&new_op->lexer, &new_token);
> +		if (rc != 0) {
> +			xrow_update_err_bad_json(new_op, rc);
> +			return NULL;
> +		}
> +		if (json_token_cmp(&old_token, &new_token) != 0)
> +			break;
> +		switch(new_token.type) {
> +		case JSON_TOKEN_NUM:
> +			rc = tuple_field_go_to_index(&parent, new_token.num);
> +			break;
> +		case JSON_TOKEN_STR:
> +			rc = tuple_field_go_to_key(&parent, new_token.str,
> +						   new_token.len);
> +			break;
> +		default:
> +			/*
> +			 * Can't be JSON_TOKEN_ANY: the old and
> +			 * new tokens are equal, '*' is
> +			 * considered invalid, and the old path
> +			 * was already validated. So the new
> +			 * token is valid too and can't have type
> +			 * ANY.
> +			 */
> +			assert(new_token.type == JSON_TOKEN_END);
> +			xrow_update_err_double(new_op);
> +			return NULL;
> +		}
> +		/*
> +		 * Must always find a field, because the old
> +		 * token already went that path.
> +		 */
> +		assert(rc == 0);
> +	} while (true);
> +	enum mp_type type = mp_typeof(*parent);
> +	/*
> +	 * Ched if the paths are different from the very
           ^^^^
           check(ed)?
> +	 * beginning. It means, that the old field should be
> +	 * transformed instead of creating a new route node.
> +	 */
> +	bool transform_root = saved_old_offset == 0;

It would be nicer if you put the comparison in parentheses up there.

> +	struct xrow_update_field *next_hop;
> +	if (! transform_root) {

No spaces after unaries; e.g. sed -i -e 's/(! /(!/g'

> +		next_hop = (struct xrow_update_field *)
> +			region_alloc(&fiber()->gc, sizeof(*next_hop));
> +		if (next_hop == NULL) {
> +			diag_set(OutOfMemory, sizeof(*next_hop), "region_alloc",
> +				 "next_hop");
> +			return NULL;
> +		}
> +	} else {
> +		next_hop = field;
> +	}
> +
> +	int path_offset = old_path_lexer.offset;
> +	struct xrow_update_field child = *field;
> +	if (child.type == XUPDATE_ROUTE) {
> +		child.route.path += path_offset;
> +		child.route.path_len -= path_offset;
> +		if (child.route.path_len == 0)
> +			child = *child.route.next_hop;
> +	} else {
> +		assert(child.type == XUPDATE_BAR);
> +		child.bar.path += path_offset;
> +		child.bar.path_len -= path_offset;
> +		/*
> +		 * Yeah, bar length can become 0 here, but it is
> +		 * ok as long as it is a scalar operation (not '!'
> +		 * and not '#'). When a bar is scalar, it operates
> +		 * on one concrete field and works even if its
> +		 * path len is 0. Talking of '#' and '!' - they
> +		 * are handled by array and map 'branchers'
> +		 * internally, below. They reapply such
> +		 * operations.
> +		 */
> +	}
> +
> +	if (type == MP_ARRAY) {
> +		if (new_token.type != JSON_TOKEN_NUM) {
> +			xrow_update_err(new_op, "can not update array by "\
> +					"non-integer index");
> +			return NULL;
> +		}
> +		new_op->token_type = JSON_TOKEN_NUM;
> +		new_op->field_no = new_token.num;
> +		if (xrow_update_route_branch_array(next_hop, parent, &child,
> +						   old_token.num) != 0)
> +			return NULL;
> +	} else if (type == MP_MAP) {
> +		diag_set(ClientError, ER_UNSUPPORTED, "update",
> +			 "path intersection on map");
> +		return NULL;
> +	} else {
> +		xrow_update_err_no_such_field(new_op);
> +		return NULL;
> +	}
> +
> +	if (! transform_root) {
> +		field->type = XUPDATE_ROUTE;
> +		field->route.path = old_path;
> +		field->route.path_len = saved_old_offset;
> +		field->route.next_hop = next_hop;
> +	}
> +	return next_hop;
> +}
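
The prefix computation in the loop above can be illustrated with a
simplified model. This Python sketch is mine, not part of the patch: it
tokenizes paths with a regex instead of json_lexer, and ignores quoted
keys and '*':

```python
import re

# Simplified model of the common-prefix search in
# xrow_update_route_branch(): walk both paths token by token and
# remember the offset of the last matching token. Offset 0 means
# "transform the root"; a positive offset becomes the route length.
_TOKEN = re.compile(r'\[\d+\]|\.[A-Za-z_]\w*')

def common_route_offset(old_path, new_path):
    offset = 0
    for old_tok, new_tok in zip(_TOKEN.finditer(old_path),
                                _TOKEN.finditer(new_path)):
        if old_tok.group() != new_tok.group():
            break
        offset = old_tok.end()
    return offset
```

For example, '[1].a.b' vs '[1].a.c' gives offset 5 (the route '[1].a'),
while '[1].a.b' vs '[2].c.d' gives 0 - the transform-root case from the
comment above.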
> +
> +/**
> + * Obtain the next node of the update tree to which @a op should
> + * be propagated. It is the same as branch, but has a fast path
> + * for the case when @a field is a route and the operation's
> + * prefix matches this route - then there is no need to parse
> + * JSON and dive into MessagePack; the route is simply followed
> + * via a lexer offset increase.
> + */
> +static struct xrow_update_field *
> +xrow_update_route_next(struct xrow_update_field *field, struct xrow_update_op *op)
> +{
> +	assert(field->type == XUPDATE_ROUTE);
> +	assert(! xrow_update_op_is_term(op));
> +	const char *new_path = op->lexer.src + op->lexer.offset;
> +	int new_path_len = op->lexer.src_len - op->lexer.offset;
> +	if (field->route.path_len <= new_path_len &&
> +	    memcmp(field->route.path, new_path, field->route.path_len) == 0) {
> +		/*
> +		 * Fast path: jump to the next hop with no
> +		 * decoding. It is used when several JSON updates
> +		 * share the same prefix.
> +		 */
> +		op->lexer.offset += field->route.path_len;
> +		return field->route.next_hop;
> +	}
> +	return xrow_update_route_branch(field, op);
> +}
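
The fast path is essentially a prefix test plus an offset bump. A
hypothetical model (my names, not the patch's):

```python
# Model of xrow_update_route_next()'s fast path: if the unparsed
# remainder of the operation's path starts with the route's common
# prefix, advance the lexer offset and follow next_hop; otherwise the
# slow branching path is taken.
def route_next_offset(route_path, op_path, op_offset):
    remainder = op_path[op_offset:]
    if remainder.startswith(route_path):
        return op_offset + len(route_path)  # follow next_hop
    return None  # xrow_update_route_branch() is needed
```

`startswith` here stands in for the length check plus `memcmp` in the C
code.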
> +
> +#define DO_SCALAR_OP_GENERIC(op_type)						\
> +int										\
> +xrow_update_op_do_route_##op_type(struct xrow_update_op *op,			\
> +				  struct xrow_update_field *field)		\
> +{										\
> +	assert(field->type == XUPDATE_ROUTE);					\
> +	struct xrow_update_field *next_hop = xrow_update_route_next(field, op);	\
> +	if (next_hop == NULL)							\
> +		return -1;							\
> +	return xrow_update_op_do_field_##op_type(op, next_hop);		\
> +}
> +
> +DO_SCALAR_OP_GENERIC(set)
> +
> +DO_SCALAR_OP_GENERIC(insert)
> +
> +DO_SCALAR_OP_GENERIC(delete)
> +
> +DO_SCALAR_OP_GENERIC(arith)
> +
> +DO_SCALAR_OP_GENERIC(bit)
> +
> +DO_SCALAR_OP_GENERIC(splice)
> +
> +uint32_t
> +xrow_update_route_sizeof(struct xrow_update_field *field)
> +{
> +	return field->size - field->route.next_hop->size +
> +	       xrow_update_field_sizeof(field->route.next_hop);
> +}
> +
> +uint32_t
> +xrow_update_route_store(struct xrow_update_field *field, char *out,
> +			char *out_end)
> +{
> +	char *saved_out = out;
> +	int before_hop = field->route.next_hop->data - field->data;
> +	memcpy(out, field->data, before_hop);
> +	out += before_hop;
> +	out += xrow_update_field_store(field->route.next_hop, out, out_end);
> +	int after_hop = before_hop + field->route.next_hop->size;
> +	memcpy(out, field->data + after_hop, field->size - after_hop);
> +	return out + field->size - after_hop - saved_out;
> +}
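
Store is a three-part splice: the unchanged bytes before next_hop, the
freshly stored hop, and the unchanged tail. A bytes-level sketch of that
idea (hypothetical, without the real MessagePack machinery):

```python
# Model of xrow_update_route_store(): copy the prefix before the
# next_hop verbatim, emit the re-encoded hop, then copy the suffix.
def route_store(field_data, hop_offset, hop_old_size, new_hop_bytes):
    prefix = field_data[:hop_offset]
    suffix = field_data[hop_offset + hop_old_size:]
    return prefix + new_hop_bytes + suffix
```

This also makes the sizeof formula above obvious: total size changes by
exactly the size delta of the next_hop.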
> diff --git a/test/box/update.result b/test/box/update.result
> index ee38c100f..bf76f6638 100644
> --- a/test/box/update.result
> +++ b/test/box/update.result
> @@ -1294,6 +1294,183 @@ vy_s:select()
>  vy_s:drop()
>  ---
>  ...
> +--
> +-- Complex updates. Intersected paths, different types.
> +--
> +--
> +-- =
> +--
> +-- Difference in the last part.
> +t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][1][2]', 375}})
> +---
> +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
> +      'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[325, 375], [400,
> +        450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
> +...
> +-- Different starting from the non-last part.
> +t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][2][2]', 475}})
> +---
> +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4, 5, 6, 7, 8], 'e': 500,
> +      'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[325, 350], [400,
> +        475]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
> +...
> +-- Contains string keys in the middle.
> +t:update({{'=', '[2].c.f[1]', 4000}, {'=', '[2].c.f[2]', 5000}})
> +---
> +- [1, {'b': 200, 'm': true, 'a': 100, 'c': {'d': 400, 'f': [4000, 5000, 6, 7, 8],
> +      'e': 500, 'g': {'k': 600, 'l': 700}}, 'g': [800, 900]}, [100, 200, [[300, 350],
> +      [400, 450]], {'a': 500, 'b': 600}, {'c': 700, 'd': 800}]]
> +...
> +t = {1}
> +---
> +...
> +t[2] = setmetatable({}, {__serialize = 'map'})
> +---
> +...
> +t[3] = {}
> +---
> +...
> +t[4] = {                        \
> +    1,                          \
> +    2,                          \
> +    3,                          \
> +    {                           \
> +        4,                      \
> +        5,                      \
> +        6,                      \
> +        7,                      \
> +        {                       \
> +            8,                  \
> +            9,                  \
> +            {10, 11, 12},       \
> +            13,                 \
> +            14,                 \
> +        },                      \
> +        15,                     \
> +        16,                     \
> +        {17, 18, 19}            \
> +    },                          \
> +    20,                         \
> +    {                           \
> +        a = 21,                 \
> +        b = 22,                 \
> +        c = {                   \
> +            d = 23,             \
> +            e = 24              \
> +        }                       \
> +    },                          \
> +    25                          \
> +}
> +---
> +...
> +t = s:replace(t)
> +---
> +...
> +-- Non-first and non-last field updates.
> +t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][8][3]', 19000}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11000, 12], 13, 14], 15, 16, [17,
> +        18, 19000]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- Triple intersection with enabled fast jump by route.
> +t:update({{'=', '[4][4][3]', 6000}, {'=', '[4][4][5][2]', 9000}, {'=', '[4][4][8][2]', 18000}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5, 6000, 7, [8, 9000, [10, 11, 12], 13, 14], 15, 16, [
> +        17, 18000, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- Triple intersection with no optimizations.
> +t:update({{'=', '[4][3]', 3000}, {'=', '[4][4][2]', 5000}, {'=', '[4][4][5][1]', 8000}})
> +---
> +- [1, {}, [], [1, 2, 3000, [4, 5000, 6, 7, [8000, 9, [10, 11, 12], 13, 14], 15, 16,
> +      [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- Last operation splits previous ones.
> +t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][5][3][1]', 10000}, {'=', '[4][4][3]', 6000}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5, 6000, 7, [8, 9, [10000, 11000, 12], 13, 14], 15, 16,
> +      [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- First two operations generate a route '[4]' in the 4th field of
> +-- the tuple, with children [1] and [2]. The last operation should
> +-- delete the route and replace it with a full-featured array.
> +t:update({{'=', '[4][4][1]', 4.5}, {'=', '[4][4][2]', 5.5}, {'=', '[4][3]', 3.5}})
> +---
> +- [1, {}, [], [1, 2, 3.5, [4.5, 5.5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [
> +        17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- Test errors.
> +t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][5].a', -1}})
> +---
> +- error: 'Field ''[4][4][5].a'' UPDATE error: can not update array by non-integer
> +    index'
> +...
> +t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][*].a', -1}})
> +---
> +- error: 'Field ''[4][4][*].a'' UPDATE error: can not update array by non-integer
> +    index'
> +...
> +t:update({{'=', '[4][4][*].b', 8000}, {'=', '[4][4][*].a', -1}})
> +---
> +- error: 'Field ''[4][4][*].b'' UPDATE error: invalid JSON in position 8'
> +...
> +t:update({{'=', '[4][4][1]', 4000}, {'=', '[4][4]-badjson', -1}})
> +---
> +- error: 'Field ''[4][4]-badjson'' UPDATE error: invalid JSON in position 7'
> +...
> +t:update({{'=', '[4][4][1]', 1}, {'=', '[4][4][1]', 2}})
> +---
> +- error: 'Field ''[4][4][1]'' UPDATE error: double update of the same field'
> +...
> +-- First two operations produce a zero-length bar update in field
> +-- [4][4][5][4]. The third operation should raise an error.
> +t:update({{'+', '[4][4][5][4]', 13000}, {'=', '[4][4][5][1]', 8000}, {'+', '[4][4][5][4]', 1}})
> +---
> +- error: 'Field ''[4][4][5][4]'' UPDATE error: double update of the same field'
> +...
> +--
> +-- !
> +--
> +t:update({{'!', '[4][5]', 19.5}, {'!', '[4][6]', 19.75}, {'!', '[4][4][3]', 5.5}, {'=', '[4][4][2]', 5.25}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5.25, 5.5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16,
> +      [17, 18, 19]], 19.5, 19.75, 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}},
> +    25]]
> +...
> +--
> +-- #
> +--
> +t:update({{'#', '[4][4][5]', 1}, {'#', '[4][4][4]', 1}, {'#', '[4][4][5]', 1}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5, 6, 15, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {
> +        'd': 23, 'e': 24}}, 25]]
> +...
> +--
> +-- Scalar operations.
> +--
> +t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}})
> +---
> +- [1, {}, [], [1, 2, 3.5, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14.75], 15, 16, [
> +        17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}, {'+', '[4][4][3]', 0.25}})
> +---
> +- [1, {}, [], [1, 2, 3.5, [4, 5, 6.25, 7, [8, 9, [10, 11, 12], 13, 14.75], 15, 16,
> +      [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- Combined - check that operations following '!' take into
> +-- account that there is a new element. Indexes should be relative
> +-- to the new array size.
> +t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]', 0.25}})
> +---
> +- [1, {}, [], [1, 2, 2.5, 3, [4, 5, 6.25, 7, [8, 9, [10, 11, 12], 13, 14.75], 15,
> +      16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +-- Error - an attempt to follow JSON path in a scalar field
> +-- during branching.
> +t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
> +---
> +- error: Field ''[4][3][2]'' was not found in the tuple
> +...
>  s:drop()
>  ---
>  ...
> diff --git a/test/box/update.test.lua b/test/box/update.test.lua
> index 60e669d27..d91a2e95c 100644
> --- a/test/box/update.test.lua
> +++ b/test/box/update.test.lua
> @@ -424,4 +424,95 @@ box.commit()
>  vy_s:select()
>  vy_s:drop()
>  
> +--
> +-- Complex updates. Intersected paths, different types.
> +--
> +
> +--
> +-- =
> +--
> +-- Difference in the last part.
> +t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][1][2]', 375}})
> +-- Different starting from the non-last part.
> +t:update({{'=', '[3][3][1][1]', 325}, {'=', '[3][3][2][2]', 475}})
> +-- Contains string keys in the middle.
> +t:update({{'=', '[2].c.f[1]', 4000}, {'=', '[2].c.f[2]', 5000}})
> +t = {1}
> +t[2] = setmetatable({}, {__serialize = 'map'})
> +t[3] = {}
> +t[4] = {                        \
> +    1,                          \
> +    2,                          \
> +    3,                          \
> +    {                           \
> +        4,                      \
> +        5,                      \
> +        6,                      \
> +        7,                      \
> +        {                       \
> +            8,                  \
> +            9,                  \
> +            {10, 11, 12},       \
> +            13,                 \
> +            14,                 \
> +        },                      \
> +        15,                     \
> +        16,                     \
> +        {17, 18, 19}            \
> +    },                          \
> +    20,                         \
> +    {                           \
> +        a = 21,                 \
> +        b = 22,                 \
> +        c = {                   \
> +            d = 23,             \
> +            e = 24              \
> +        }                       \
> +    },                          \
> +    25                          \
> +}
> +t = s:replace(t)
> +-- Non-first and non-last field updates.
> +t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][8][3]', 19000}})
> +-- Triple intersection with enabled fast jump by route.
> +t:update({{'=', '[4][4][3]', 6000}, {'=', '[4][4][5][2]', 9000}, {'=', '[4][4][8][2]', 18000}})
> +-- Triple intersection with no optimizations.
> +t:update({{'=', '[4][3]', 3000}, {'=', '[4][4][2]', 5000}, {'=', '[4][4][5][1]', 8000}})
> +-- Last operation splits previous ones.
> +t:update({{'=', '[4][4][5][3][2]', 11000}, {'=', '[4][4][5][3][1]', 10000}, {'=', '[4][4][3]', 6000}})
> +-- First two operations generate a route '[4]' in the 4th field of
> +-- the tuple, with children [1] and [2]. The last operation should
> +-- delete the route and replace it with a full-featured array.
> +t:update({{'=', '[4][4][1]', 4.5}, {'=', '[4][4][2]', 5.5}, {'=', '[4][3]', 3.5}})
> +-- Test errors.
> +t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][5].a', -1}})
> +t:update({{'=', '[4][4][5][1]', 8000}, {'=', '[4][4][*].a', -1}})
> +t:update({{'=', '[4][4][*].b', 8000}, {'=', '[4][4][*].a', -1}})
> +t:update({{'=', '[4][4][1]', 4000}, {'=', '[4][4]-badjson', -1}})
> +t:update({{'=', '[4][4][1]', 1}, {'=', '[4][4][1]', 2}})
> +-- First two operations produce a zero-length bar update in field
> +-- [4][4][5][4]. The third operation should raise an error.
> +t:update({{'+', '[4][4][5][4]', 13000}, {'=', '[4][4][5][1]', 8000}, {'+', '[4][4][5][4]', 1}})
> +
> +--
> +-- !
> +--
> +t:update({{'!', '[4][5]', 19.5}, {'!', '[4][6]', 19.75}, {'!', '[4][4][3]', 5.5}, {'=', '[4][4][2]', 5.25}})
> +--
> +-- #
> +--
> +t:update({{'#', '[4][4][5]', 1}, {'#', '[4][4][4]', 1}, {'#', '[4][4][5]', 1}})
> +--
> +-- Scalar operations.
> +--
> +t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}})
> +t:update({{'+', '[4][3]', 0.5}, {'+', '[4][4][5][5]', 0.75}, {'+', '[4][4][3]', 0.25}})
> +-- Combined - check that operations following '!' take into
> +-- account that there is a new element. Indexes should be relative
> +-- to the new array size.
> +t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]', 0.25}})
> +-- Error - an attempt to follow JSON path in a scalar field
> +-- during branching.
> +t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
> +
>  s:drop()
> -- 
> 2.21.0 (Apple Git-122.2)
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps Vladislav Shpilevoy
@ 2019-12-27 14:57   ` Sergey Ostanevich
  2019-12-27 15:52     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 13+ messages in thread
From: Sergey Ostanevich @ 2019-12-27 14:57 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi!

Thanks for the patch!
Just the spaces after unaries - otherwise LGTM.

If I got it right, the limitation from the #2 patch:

|    This is not allowed yet:
|    
|        [1][2][3].a.b.c = 20
|        [1][2][3].a.e.f = 30
|    
|        First difference is 'b' vs 'e' - intersection by a map,
|        not ok.

is removed. Could you mention this in the description?

Regards,
Sergos

On 23 Dec 23:41, Vladislav Shpilevoy wrote:
> Previous commits introduced isolated JSON updates, and then
> allowed intersection at array. This one completes the puzzle,
> adding intersection at maps. Now JSON updates are fully
> available.
> 
> Closes #1261
> 
> @TarantoolBot document
> Title: JSON updates
> Tuple/space/index:update/upsert now support JSON paths. All the
> same paths as allowed in tuple["..."].
> 
> Example:
> box.cfg{}
> format = {}
> format[1] = {'field1', 'unsigned'}
> format[2] = {'field2', 'map'}
> format[3] = {'field3', 'array'}
> format[4] = {'field4', 'string', is_nullable = true}
> s = box.schema.create_space('test', {format = format})
> _ = s:create_index('pk')
> t = {
>     1,
>     {
>         key1 = 'value',
>         key2 = 10
>     },
>     {
>         2,
>         3,
>         {key3 = 20}
>     }
> }
> t = s:replace(t)
> 
> tarantool> t:update({{'=', 'field2.key1', 'new_value'}})
> ---
> - [1, {'key1': 'new_value', 'key2': 10}, [2, 3, {'key3': 20}]]
> ...
> 
> tarantool> t:update({{'+', 'field3[2]', 1}})
> ---
> - [1, {'key1': 'value', 'key2': 10}, [2, 4, {'key3': 20}]]
> ...
> 
> tarantool> s:update({1}, {{'!', 'field4', 'inserted value'}})
> ---
> - [1, {'key1': 'value', 'key2': 10}, [2, 3, {'key3': 20}], 'inserted value']
> ...
> 
> tarantool> s:update({1}, {{'#', '[2].key2', 1}, {'=', '[3][3].key4', 'value4'}})
> ---
> - [1, {'key1': 'value'}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
> ...
> 
> tarantool> s:upsert({1, {k = 'v'}, {}}, {{'#', '[2].key1', 1}})
> ---
> ...
> 
> tarantool> s:select{}
> ---
> - - [1, {}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
> ...
> 
> Note that there is the same rule as in tuple field access by
> JSON for field names that look like JSON paths. The rule is that
> first the whole path is interpreted as a field name. If such a
> name does not exist, then it is treated as a path. For example,
> if there is a field named 'field.name.like.json', then the update
> 
>     <obj>:update({..., 'field.name.like.json', ...})
> 
> will update this field, instead of keys 'field' -> 'name' ->
> 'like' -> 'json'. If such a name is needed as a part of a bigger
> path, then it should be wrapped in quotes and []:
> 
>     <obj>:update({..., '["field.name.like.json"].next.fields', ...})
> 
> There are some new rules for JSON updates:
> 
> - Operation '!' can't be used to create all intermediate nodes of
>   a path. For example, {'!', 'field1[1].field3', ...} can't
>   create fields 'field1' and '[1]', they should exist.
> 
> - Operation '#', when applied to maps, can't delete more than one
>   key at once. That is, its argument should be always 1 for maps.
> 
>     {'#', 'field1.field2', 1} - this is allowed;
>     {'#', 'field1.field2', 10} - this is not.
> 
>   That limitation originates from the fact that keys in a map
>   are not ordered in any way, and '#' with more than 1 key would
>   lead to undefined behaviour.
> 
> - Operation '!' on maps can't create a key, if it exists already.
> 
> - If a map contains non-string keys (booleans, numbers, maps,
>   arrays - anything), then these keys can't be updated via JSON
>   paths. But it is still allowed to update string keys in such a
>   map.
> 
> Why are JSON updates good, and why should they be preferred when
> only a part of a tuple needs to be updated?
> 
> - They consume less space in WAL, because for an update only its
>   keys, operations, and arguments are stored. It is cheaper to
>   store update of one deep field, than the whole tuple.
> 
> - They are faster. Firstly, this is because they are implemented
>   in C, and have no problems with Lua GC and dynamic typing.
>   Secondly, some cases of JSON paths are highly optimized. For
>   example, an update with a single JSON path costs O(1) memory
>   regardless of how deep that path goes (not counting update
>   arguments).
> 
> - They are available from remote clients, as well as any other
>   DML. Before JSON updates, to update one deep part of a tuple it
>   was necessary to download that tuple, update it in memory, and
>   send it back - 2 network hops. With JSON paths it can be 1,
>   when the update can be described in paths.
> ---
>  src/box/CMakeLists.txt      |   1 +
>  src/box/xrow_update_field.c |   4 +
>  src/box/xrow_update_field.h |  85 ++++++-
>  src/box/xrow_update_map.c   | 453 ++++++++++++++++++++++++++++++++++++
>  src/box/xrow_update_route.c |  47 +++-
>  test/box/update.result      | 152 ++++++++++++
>  test/box/update.test.lua    |  75 ++++++
>  7 files changed, 813 insertions(+), 4 deletions(-)
>  create mode 100644 src/box/xrow_update_map.c
> 
> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
> index f99efc503..2022b378b 100644
> --- a/src/box/CMakeLists.txt
> +++ b/src/box/CMakeLists.txt
> @@ -45,6 +45,7 @@ add_library(tuple STATIC
>      xrow_update_array.c
>      xrow_update_bar.c
>      xrow_update_route.c
> +    xrow_update_map.c
>      tuple_compare.cc
>      tuple_extract_key.cc
>      tuple_hash.cc
> diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
> index 3a4aa6c90..a8db31768 100644
> --- a/src/box/xrow_update_field.c
> +++ b/src/box/xrow_update_field.c
> @@ -116,6 +116,8 @@ xrow_update_field_sizeof(struct xrow_update_field *field)
>  		return xrow_update_bar_sizeof(field);
>  	case XUPDATE_ROUTE:
>  		return xrow_update_route_sizeof(field);
> +	case XUPDATE_MAP:
> +		return xrow_update_map_sizeof(field);
>  	default:
>  		unreachable();
>  	}
> @@ -145,6 +147,8 @@ xrow_update_field_store(struct xrow_update_field *field, char *out,
>  		return xrow_update_bar_store(field, out, out_end);
>  	case XUPDATE_ROUTE:
>  		return xrow_update_route_store(field, out, out_end);
> +	case XUPDATE_MAP:
> +		return xrow_update_map_store(field, out, out_end);
>  	default:
>  		unreachable();
>  	}
> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
> index f07b5a649..6b2a0d999 100644
> --- a/src/box/xrow_update_field.h
> +++ b/src/box/xrow_update_field.h
> @@ -32,6 +32,7 @@
>   */
>  #include "trivia/util.h"
>  #include "tt_static.h"
> +#include "salad/stailq.h"
>  #include "json/json.h"
>  #include "bit/int96.h"
>  #include "mp_decimal.h"
> @@ -254,7 +255,8 @@ enum xrow_update_type {
>  	 * for example, when a rope is split in two parts: not
>  	 * changed left range of fields, and a right range with
>  	 * its first field changed. In this case the left range is
> -	 * NOP.
> +	 * NOP. The same happens when a map is updated and split
> +	 * into ranges, where only certain points are not NOP.
>  	 */
>  	XUPDATE_NOP,
>  	/**
> @@ -292,6 +294,11 @@ enum xrow_update_type {
>  	 * '[2] = 30', '[3] = true'.
>  	 */
>  	XUPDATE_ROUTE,
> +	/**
> +	 * Field is an updated map. Check item list for updates of
> +	 * individual fields.
> +	 */
> +	XUPDATE_MAP,
>  };
>  
>  /**
> @@ -383,6 +390,48 @@ struct xrow_update_field {
>  			/** Update subtree. */
>  			struct xrow_update_field *next_hop;
>  		} route;
> +		/**
> +		 * The field is an updated map. Individual fields
> +		 * are stored very similarly to an array update
> +		 * and its rope nodes. Each item is a key, a
> +		 * value, and a tail of unchanged key-value
> +		 * pairs. The items are stored in a list, but the
> +		 * list is not sorted by keys in any way, because
> +		 * that would not really make sense:
> +		 *
> +		 * 1) Keys in MessagePack are not sorted anyway,
> +		 *    and any kind of search would not be
> +		 *    possible even if they were sorted. Sorting
> +		 *    a map would require N*log(N) time and N
> +		 *    memory even if only a few fields were
> +		 *    updated.
> +		 *
> +		 * 2) A double scalar update of the same key is
> +		 *    not possible.
> +		 *
> +		 * Still, there is something which can be and is
> +		 * optimized. When a key is updated the first
> +		 * time, it is moved to the beginning of the
> +		 * list, and after all operations are done, it is
> +		 * stored in the result tuple before the
> +		 * unchanged fields. On a second update of the
> +		 * same key it is found immediately.
> +		 */
> +		struct {
> +			/**
> +			 * List of map update items. Each item is
> +			 * a key, a value, and an unchanged tail.
> +			 */
> +			struct stailq items;
> +			/**
> +			 * Number of key-value pairs in the list.
> +			 * Note, it is not the number of items,
> +			 * but exactly the number of key-value
> +			 * pairs. It is used to store the
> +			 * MessagePack map header without
> +			 * decoding each item again just to learn
> +			 * the size.
> +			 */
> +			int size;
> +		} map;
>  	};
>  };
>  
> @@ -506,6 +555,38 @@ OP_DECL_GENERIC(array)
>  
>  /* }}} xrow_update_field.array */
>  
> +/* {{{ xrow_update_field.map */
> +
> +/**
> + * Initialize @a field as a map to update.
> + * @param[out] field Field to initialize.
> + * @param header Header of the MessagePack map @a data.
> + * @param data MessagePack data of the map to update.
> + * @param data_end End of @a data.
> + * @param field_count Key-value pair count in @a data.
> + *
> + * @retval  0 Success.
> + * @retval -1 Error.
> + */
> +int
> +xrow_update_map_create(struct xrow_update_field *field, const char *header,
> +		       const char *data, const char *data_end, int field_count);
> +
> +/**
> + * Create a map update with an existing child. Motivation is
> + * exactly the same as for the similar array constructor: it
> + * allows avoiding unnecessary MessagePack decoding.
> + */
> +int
> +xrow_update_map_create_with_child(struct xrow_update_field *field,
> +				  const char *header,
> +				  const struct xrow_update_field *child,
> +				  const char *key, uint32_t key_len);
> +
> +OP_DECL_GENERIC(map)
> +
> +/* }}} xrow_update_field.map */
> +
>  /* {{{ update_field.bar */
>  
>  OP_DECL_GENERIC(bar)
> @@ -587,6 +668,8 @@ xrow_update_op_do_field_##op_type(struct xrow_update_op *op,			\
>  		return xrow_update_op_do_bar_##op_type(op, field);		\
>  	case XUPDATE_ROUTE:							\
>  		return xrow_update_op_do_route_##op_type(op, field);		\
> +	case XUPDATE_MAP:							\
> +		return xrow_update_op_do_map_##op_type(op, field);		\
>  	default:								\
>  		unreachable();							\
>  	}									\
> diff --git a/src/box/xrow_update_map.c b/src/box/xrow_update_map.c
> new file mode 100644
> index 000000000..22fabff96
> --- /dev/null
> +++ b/src/box/xrow_update_map.c
> @@ -0,0 +1,453 @@
> +/*
> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include "xrow_update_field.h"
> +#include "fiber.h"
> +#include "small/region.h"
> +#include "tuple_format.h"
> +
> +/**
> + * Descriptor of one updated key-value pair. Besides updated data
> + * it contains a tail of unchanged pairs, so as not to create a
> + * separate object for them, and to keep the layout similar to
> + * array update items.
> + */
> +struct xrow_update_map_item {
> +	/**
> +	 * Updated key. Can be NULL. In such a case this item
> +	 * contains only an unchanged tail. A key can be nullified
> +	 * if it was removed from the map, or when a map is just
> +	 * created and has no updates yet.
> +	 */
> +	const char *key;
> +	/** Length of @a key. */
> +	uint32_t key_len;
> +	/** Updated value. */
> +	struct xrow_update_field field;
> +	/** Link in the list of updated map keys. */
> +	struct stailq_entry in_items;
> +	/**
> +	 * Size in bytes of unchanged tail data. It goes right
> +	 * after @a field.data.
> +	 */
> +	uint32_t tail_size;
> +};
> +
> +static inline struct xrow_update_map_item *
> +xrow_update_map_item_alloc(void)
> +{
> +	struct xrow_update_map_item *item = (struct xrow_update_map_item *)
> +		region_alloc(&fiber()->gc, sizeof(*item));
> +	if (item == NULL)
> +		diag_set(OutOfMemory, sizeof(*item), "region_alloc", "item");
> +	return item;
> +}
> +
> +static void
> +xrow_update_map_create_item(struct xrow_update_field *field,
> +			    struct xrow_update_map_item *item,
> +			    enum xrow_update_type type, const char *key,
> +			    uint32_t key_len, const char *data,
> +			    uint32_t data_size, uint32_t tail_size)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	item->key = key;
> +	item->key_len = key_len;
> +	item->field.type = type;
> +	item->field.data = data;
> +	item->field.size = data_size;
> +	item->tail_size = tail_size;
> +	/*
> +	 * Each time a new item is created, it is stored at the
> +	 * head of the update map item list. It helps when the
> +	 * tuple is updated regularly, because on all next updates
> +	 * this key will be found at the very beginning of the
> +	 * map.
> +	 */
> +	stailq_add_entry(&field->map.items, item, in_items);
> +}
> +
> +static inline struct xrow_update_map_item *
> +xrow_update_map_new_item(struct xrow_update_field *field,
> +			 enum xrow_update_type type, const char *key,
> +			 uint32_t key_len, const char *data, uint32_t data_size,
> +			 uint32_t tail_size)
> +{
> +	struct xrow_update_map_item *item = xrow_update_map_item_alloc();
> +	if (item != NULL) {
> +		xrow_update_map_create_item(field, item, type, key, key_len,
> +					    data, data_size, tail_size);
> +	}
> +	return item;
> +}
> +
> +/**
> + * Find an update item to which @a op should be applied. The
> + * target field may not exist, but at least its parent should.
> + */
> +static int
> +xrow_update_map_extract_opt_item(struct xrow_update_field *field,
> +				 struct xrow_update_op *op,
> +				 struct xrow_update_map_item **res)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	if (op->token_type == JSON_TOKEN_END) {
> +		if (xrow_update_op_consume_token(op) != 0)
> +			return -1;
> +		if (op->token_type != JSON_TOKEN_STR) {
> +			return xrow_update_err(op, "can't update a map by not "\
> +					       "a string key");
> +		}
> +	}
> +	struct stailq *items = &field->map.items;
> +	struct xrow_update_map_item *i, *new_item;
> +	/*
> +	 * First, try to find the key among the already updated
> +	 * ones. Perhaps the needed node is an intermediate key of
> +	 * a longer JSON path, and many updates pass through this
> +	 * key, so it will be found here for all but the first
> +	 * update.
> +	 */
> +	stailq_foreach_entry(i, items, in_items) {
> +		if (i->key != NULL && i->key_len == op->key_len &&
> +		    memcmp(i->key, op->key, i->key_len) == 0) {
> +			*res = i;
> +			return 0;
> +		}
> +	}
> +	/*
> +	 * Slow path - the key is updated for the first time, so
> +	 * the tails need to be decoded.
> +	 */
> +	uint32_t key_len, i_tail_size;
> +	const char *pos, *key, *end, *tmp, *begin;
> +	stailq_foreach_entry(i, items, in_items) {
> +		begin = i->field.data + i->field.size;
> +		pos = begin;
> +		end = begin + i->tail_size;
> +		i_tail_size = 0;
> +		while (pos < end) {
> +			if (mp_typeof(*pos) != MP_STR) {
> +				mp_next(&pos);
> +				mp_next(&pos);
> +				continue;
> +			}
> +			i_tail_size = pos - begin;
> +			key = mp_decode_str(&pos, &key_len);
> +			if (key_len == op->key_len &&
> +			    memcmp(key, op->key, key_len) == 0)
> +				goto key_is_found;
> +			mp_next(&pos);
> +		}
> +	}
> +	*res = NULL;
> +	return 0;
> +
> +key_is_found:
> +	tmp = pos;
> +	mp_next(&tmp);
> +	if (i_tail_size == 0 && i->key == NULL) {
> +		/*
> +		 * The needed key was found at the beginning of
> +		 * the tail of an item having no key. This is
> +		 * actually good, because this item can be
> +		 * transformed instead of creating a new one.
> +		 */
> +		i->key = op->key;
> +		i->key_len = op->key_len;
> +		i->field.data = pos;
> +		i->field.size = tmp - pos;
> +		i->tail_size = end - tmp;
> +		new_item = i;
> +	} else {
> +		new_item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
> +						    op->key_len, pos, tmp - pos,
> +						    end - tmp);
> +		if (new_item == NULL)
> +			return -1;
> +		i->tail_size = i_tail_size;
> +	}
> +	*res = new_item;
> +	return 0;
> +}
> +
> +/**
> + * The same as the optional extractor, but the field to update
> + * must exist. This is the case for all scalar operations (except
> + * '=', which can work as an insert).
> + */
> +static inline struct xrow_update_map_item *
> +xrow_update_map_extract_item(struct xrow_update_field *field,
> +			     struct xrow_update_op *op)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	struct xrow_update_map_item *res;
> +	if (xrow_update_map_extract_opt_item(field, op, &res) != 0)
> +		return NULL;
> +	if (res == NULL)
> +		xrow_update_err_no_such_field(op);
> +	return res;
> +}
> +
> +int
> +xrow_update_op_do_map_insert(struct xrow_update_op *op,
> +			     struct xrow_update_field *field)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	struct xrow_update_map_item *item;
> +	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
> +		return -1;
> +	if (! xrow_update_op_is_term(op)) {
> +		if (item == NULL)
> +			return xrow_update_err_no_such_field(op);
> +		op->token_type = JSON_TOKEN_END;
> +		return xrow_update_op_do_field_insert(op, &item->field);
> +	}
> +	if (item != NULL)
> +		return xrow_update_err_duplicate(op);
> +	++field->map.size;
> +	item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
> +					op->key_len, op->arg.set.value,
> +					op->arg.set.length, 0);
> +	return item != NULL ? 0 : -1;
> +}
> +
> +int
> +xrow_update_op_do_map_set(struct xrow_update_op *op,
> +			  struct xrow_update_field *field)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	struct xrow_update_map_item *item;
> +	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
> +		return -1;
> +	if (! xrow_update_op_is_term(op)) {
> +		if (item == NULL)
> +			return xrow_update_err_no_such_field(op);
> +		op->token_type = JSON_TOKEN_END;
> +		return xrow_update_op_do_field_set(op, &item->field);
> +	}
> +	if (item != NULL) {
> +		op->new_field_len = op->arg.set.length;
> +		/* Ignore the previous op, if any. */
> +		item->field.type = XUPDATE_SCALAR;
> +		item->field.scalar.op = op;
> +		return 0;
> +	}
> +	++field->map.size;
> +	item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
> +					op->key_len, op->arg.set.value,
> +					op->arg.set.length, 0);
> +	return item != NULL ? 0 : -1;
> +}
> +
> +int
> +xrow_update_op_do_map_delete(struct xrow_update_op *op,
> +			     struct xrow_update_field *field)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	struct xrow_update_map_item *item;
> +	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
> +		return -1;
> +	if (! xrow_update_op_is_term(op)) {
> +		if (item == NULL)
> +			return xrow_update_err_no_such_field(op);
> +		op->token_type = JSON_TOKEN_END;
> +		return xrow_update_op_do_field_delete(op, &item->field);
> +	}
> +	if (op->arg.del.count != 1)
> +		return xrow_update_err_delete1(op);
> +	if (item == NULL)
> +		return 0;
> +	/*
> +	 * Note, even if the tail size becomes 0, this item is not
> +	 * deleted. This is because the items are linked via a
> +	 * stailq, whose elements can't be deleted that simply.
> +	 * But it is not a big deal, because '#' is a really rare
> +	 * operation.
> +	 * Why can't the next key just be decoded from the tail?
> +	 * Why should the key be NULL here? Because JSON updates
> +	 * allow updating a map containing non-string keys. If the
> +	 * next key is not a string, it can't be saved as the key
> +	 * of the item. Therefore, it is better not to touch the
> +	 * unchanged tails unless a new update operation needs it.
> +	 */
> +	item->key = NULL;
> +	item->key_len = 0;
> +	item->field.data += item->field.size;
> +	item->field.size = 0;
> +	item->field.type = XUPDATE_NOP;
> +	--field->map.size;
> +	return 0;
> +}
> +
> +#define DO_SCALAR_OP_GENERIC(op_type)						\
> +int										\
> +xrow_update_op_do_map_##op_type(struct xrow_update_op *op,			\
> +				struct xrow_update_field *field)		\
> +{										\
> +	assert(field->type == XUPDATE_MAP);					\
> +	struct xrow_update_map_item *item =					\
> +		xrow_update_map_extract_item(field, op);			\
> +	if (item == NULL)							\
> +		return -1;							\
> +	if (! xrow_update_op_is_term(op)) {					\
> +		op->token_type = JSON_TOKEN_END;				\
> +		return xrow_update_op_do_field_##op_type(op, &item->field);	\
> +	}									\
> +	if (item->field.type != XUPDATE_NOP)					\
> +		return xrow_update_err_double(op);				\
> +	if (xrow_update_op_do_##op_type(op, item->field.data) != 0)		\
> +		return -1;							\
> +	item->field.type = XUPDATE_SCALAR;					\
> +	item->field.scalar.op = op;						\
> +	return 0;								\
> +}
> +
> +DO_SCALAR_OP_GENERIC(arith)
> +
> +DO_SCALAR_OP_GENERIC(bit)
> +
> +DO_SCALAR_OP_GENERIC(splice)
> +
> +int
> +xrow_update_map_create(struct xrow_update_field *field, const char *header,
> +		       const char *data, const char *data_end, int field_count)
> +{
> +	field->type = XUPDATE_MAP;
> +	field->data = header;
> +	field->size = data_end - header;
> +	field->map.size = field_count;
> +	stailq_create(&field->map.items);
> +	if (field_count == 0)
> +		return 0;
> +	struct xrow_update_map_item *first =
> +		xrow_update_map_new_item(field, XUPDATE_NOP, NULL, 0, data, 0,
> +					 data_end - data);
> +	return first != NULL ? 0 : -1;
> +}
> +
> +int
> +xrow_update_map_create_with_child(struct xrow_update_field *field,
> +				  const char *header,
> +				  const struct xrow_update_field *child,
> +				  const char *key, uint32_t key_len)
> +{
> +	field->type = XUPDATE_MAP;
> +	field->data = header;
> +	stailq_create(&field->map.items);
> +
> +	const char *pos = header;
> +	uint32_t i, ikey_len, field_count = mp_decode_map(&pos);
> +	const char *begin = pos;
> +	struct xrow_update_map_item *item = NULL;
> +	for (i = 0; i < field_count; ++i) {
> +		if (mp_typeof(*pos) != MP_STR) {
> +			mp_next(&pos);
> +			mp_next(&pos);
> +			continue;
> +		}
> +		const char *before_key = pos;
> +		const char *ikey = mp_decode_str(&pos, &ikey_len);
> +		if (ikey_len == key_len && memcmp(ikey, key, key_len) == 0) {
> +			item = xrow_update_map_new_item(field, XUPDATE_NOP,
> +							NULL, 0, begin, 0,
> +							before_key - begin);
> +			if (item == NULL)
> +				return -1;
> +			++i;
> +			break;
> +		}
> +		mp_next(&pos);
> +	}
> +	/*
> +	 * When a map is initialized with an existing child, it
> +	 * means that the key was already found earlier, i.e. it
> +	 * is impossible to miss it here.
> +	 */
> +	assert(item != NULL);
> +	const char *data = pos;
> +	mp_next(&pos);
> +	uint32_t data_size = pos - data;
> +	for (; i < field_count; ++i) {
> +		mp_next(&pos);
> +		mp_next(&pos);
> +	}
> +	item = xrow_update_map_item_alloc();
> +	if (item == NULL)
> +		return -1;
> +	item->field = *child;
> +	xrow_update_map_create_item(field, item, child->type, key, key_len,
> +				    data, data_size, pos - data - data_size);
> +	field->map.size = field_count;
> +	field->size = pos - header;
> +	return 0;
> +}
> +
> +uint32_t
> +xrow_update_map_sizeof(struct xrow_update_field *field)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	uint32_t res = mp_sizeof_map(field->map.size);
> +	struct xrow_update_map_item *i;
> +	stailq_foreach_entry(i, &field->map.items, in_items) {
> +		res += i->tail_size;
> +		if (i->key != NULL) {
> +			res += mp_sizeof_str(i->key_len) +
> +			       xrow_update_field_sizeof(&i->field);
> +		}
> +	}
> +	return res;
> +}
> +
> +uint32_t
> +xrow_update_map_store(struct xrow_update_field *field, char *out, char *out_end)
> +{
> +	assert(field->type == XUPDATE_MAP);
> +	char *out_begin = out;
> +	out = mp_encode_map(out, field->map.size);
> +	struct xrow_update_map_item *i;
> +	/*
> +	 * This is the trick of saving the updated keys before the
> +	 * others: the first cycle doesn't store the unchanged
> +	 * tails.
> +	 */
> +	stailq_foreach_entry(i, &field->map.items, in_items) {
> +		if (i->key != NULL) {
> +			out = mp_encode_str(out, i->key, i->key_len);
> +			out += xrow_update_field_store(&i->field, out, out_end);
> +		}
> +	}
> +	stailq_foreach_entry(i, &field->map.items, in_items) {
> +		memcpy(out, i->field.data + i->field.size, i->tail_size);
> +		out += i->tail_size;
> +	}
> +	assert(out <= out_end);
> +	return out - out_begin;
> +}
> diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
> index f374db837..b76ebb36b 100644
> --- a/src/box/xrow_update_route.c
> +++ b/src/box/xrow_update_route.c
> @@ -123,6 +123,38 @@ xrow_update_route_branch_array(struct xrow_update_field *next_hop,
>  	return op->meta->do_op(op, next_hop);
>  }
>  
> +/**
> + * Do the actual branching, but via a map and a key in that map.
> + * Works exactly the same as the array counterpart.
> + */
> +static int
> +xrow_update_route_branch_map(struct xrow_update_field *next_hop,
> +			     const char *parent,
> +			     const struct xrow_update_field *child,
> +			     const char *key, int key_len)
> +{
> +	struct xrow_update_op *op = child->bar.op;
> +	if (child->type != XUPDATE_BAR || child->bar.path_len > 0 ||
> +	    (op->opcode != '!' && op->opcode != '#')) {
> +		return xrow_update_map_create_with_child(next_hop, parent,
> +							 child, key, key_len);
> +	}
> +	op->token_type = JSON_TOKEN_STR;
> +	op->key = key;
> +	op->key_len = key_len;
> +	const char *data = parent;
> +	uint32_t field_count = mp_decode_map(&data);
> +	const char *end = data;
> +	for (uint32_t i = 0; i < field_count; ++i) {
> +		mp_next(&end);
> +		mp_next(&end);
> +	}
> +	if (xrow_update_map_create(next_hop, parent, data, end,
> +				   field_count) != 0)
> +		return -1;
> +	return op->meta->do_op(op, next_hop);
> +}
> +
>  struct xrow_update_field *
>  xrow_update_route_branch(struct xrow_update_field *field,
>  			 struct xrow_update_op *new_op)
> @@ -256,9 +288,18 @@ xrow_update_route_branch(struct xrow_update_field *field,
>  						   old_token.num) != 0)
>  			return NULL;
>  	} else if (type == MP_MAP) {
> -		diag_set(ClientError, ER_UNSUPPORTED, "update",
> -			 "path intersection on map");
> -		return NULL;
> +		if (new_token.type != JSON_TOKEN_STR) {
> +			xrow_update_err(new_op, "can not update map by "\
> +					"non-string key");
> +			return NULL;
> +		}
> +		new_op->token_type = JSON_TOKEN_STR;
> +		new_op->key = new_token.str;
> +		new_op->key_len = new_token.len;
> +		if (xrow_update_route_branch_map(next_hop, parent, &child,
> +						 old_token.str,
> +						 old_token.len) != 0)
> +			return NULL;
>  	} else {
>  		xrow_update_err_no_such_field(new_op);
>  		return NULL;
> diff --git a/test/box/update.result b/test/box/update.result
> index bf76f6638..4f9349bce 100644
> --- a/test/box/update.result
> +++ b/test/box/update.result
> @@ -1471,6 +1471,158 @@ t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
>  ---
>  - error: Field ''[4][3][2]'' was not found in the tuple
>  ...
> +--
> +-- Intersecting map updates.
> +--
> +t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [17, 18,
> +        19]], 20, {'b': 22, 'a': 21, 'c': {'d': 2300, 'e': 2400}}, 25]]
> +...
> +-- Fill an empty map.
> +t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
> +---
> +- [1, {'k': 100, 'v': 200}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14],
> +      15, 16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +t = {}                                  \
> +t[1] = 1                                \
> +t[2] = {                                \
> +    a = 1,                              \
> +    b = 2,                              \
> +    c = {3, 4, 5},                      \
> +    d = {                               \
> +        [0] = 0,                        \
> +        e = 6,                          \
> +        [true] = false,                 \
> +        f = 7,                          \
> +        [-1] = -1,                      \
> +    },                                  \
> +    g = {                               \
> +        {k = 8, v = 9},                 \
> +        [box.NULL] = box.NULL,          \
> +        {k = 10, v = 11},               \
> +        {k = '12str', v = '13str'},     \
> +    },                                  \
> +    h = {                               \
> +        [-1] = {{{{-1}, -1}, -1}, -1},  \
> +        i = {                           \
> +            j = 14,                     \
> +            k = 15,                     \
> +        },                              \
> +        m = 16,                         \
> +    }                                   \
> +}
> +---
> +...
> +t[3] = {}
> +---
> +...
> +t = box.tuple.new(t)
> +---
> +...
> +-- Insert and delete from the same map at once.
> +t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
> +---
> +- [1, {'a': 1, 'n': 17, 'd': {0: 0, -1: -1, true: false, 'f': 7, 'e': 6}, 'h': {'i': {
> +        'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3, 4, 5], 'g': {
> +      1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
> +      null: null}}, []]
> +...
> +-- Delete a non-existent key.
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
> +---
> +- [1, {'b': 2, 'a': 1, 'd': {0: 0, 'f': 7, 'e': 6, -1: -1, true: false, 'g': 6000},
> +    'h': {'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3,
> +      4, 5], 'g': {1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
> +      null: null}}, []]
> +...
> +-- Scalar operations.
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
> +---
> +- [1, {'b': 3, 'c': [3, 4, 5], 'd': {0: 0, true: false, 'f': 8, -1: -1, 'e': 7}, 'h': {
> +      'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'a': 1, 'g': {
> +      1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
> +      null: null}}, []]
> +...
> +-- Errors.
> +-- This operation array creates an update tree with several
> +-- full-featured maps, not bars. It is required to cover some
> +-- complex cases of field extraction.
> +ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
> +---
> +...
> +-- When a map update extracts the next token on demand, it should
> +-- obviously reject bad JSON. Scalar and '!'/'#' operations are
> +-- implemented separately, and both are tested.
> +ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
> +...
> +ops[#ops][1] = '!'
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
> +...
> +ops[#ops][1] = '#'
> +---
> +...
> +ops[#ops][3] = 1
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
> +...
> +-- Key extractor should reject any attempt to use non-string keys.
> +ops[#ops] = {'-', '[2].h.i[20]', -1}
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
> +    key'
> +...
> +ops[#ops][1] = '='
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
> +    key'
> +...
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
> +---
> +- error: Field ''[2].d.k'' was not found in the tuple
> +...
> +t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
> +---
> +- error: Field ''[2].d.g.h'' was not found in the tuple
> +...
> +t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
> +---
> +- error: 'Field ''[2].d.e'' UPDATE error: the key exists already'
> +...
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
> +---
> +- error: 'Field ''[2].d.old'' UPDATE error: can delete only 1 field from a map in
> +    a row'
> +...
> +-- '!'/'=' can be used to create a field, but only if it is a
> +-- tail. These operations can't create the whole path.
> +t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
> +---
> +- error: Field ''[2].d.new.new'' was not found in the tuple
> +...
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
> +---
> +- error: Field ''[2].d.old.old'' was not found in the tuple
> +...
>  s:drop()
>  ---
>  ...
> diff --git a/test/box/update.test.lua b/test/box/update.test.lua
> index d91a2e95c..6fed818f4 100644
> --- a/test/box/update.test.lua
> +++ b/test/box/update.test.lua
> @@ -515,4 +515,79 @@ t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]',
>  -- during branching.
>  t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
>  
> +--
> +-- Intersecting map updates.
> +--
> +t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
> +-- Fill an empty map.
> +t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
> +
> +t = {}                                  \
> +t[1] = 1                                \
> +t[2] = {                                \
> +    a = 1,                              \
> +    b = 2,                              \
> +    c = {3, 4, 5},                      \
> +    d = {                               \
> +        [0] = 0,                        \
> +        e = 6,                          \
> +        [true] = false,                 \
> +        f = 7,                          \
> +        [-1] = -1,                      \
> +    },                                  \
> +    g = {                               \
> +        {k = 8, v = 9},                 \
> +        [box.NULL] = box.NULL,          \
> +        {k = 10, v = 11},               \
> +        {k = '12str', v = '13str'},     \
> +    },                                  \
> +    h = {                               \
> +        [-1] = {{{{-1}, -1}, -1}, -1},  \
> +        i = {                           \
> +            j = 14,                     \
> +            k = 15,                     \
> +        },                              \
> +        m = 16,                         \
> +    }                                   \
> +}
> +t[3] = {}
> +t = box.tuple.new(t)
> +-- Insert and delete from the same map at once.
> +t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
> +-- Delete a non-existent key.
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
> +
> +-- Scalar operations.
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
> +
> +-- Errors.
> +-- This operation array creates an update tree with several
> +-- full-featured maps, not bars. It is required to cover some
> +-- complex cases of field extraction.
> +ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
> +-- When a map update extracts the next token on demand, it should
> +-- obviously reject bad JSON. Scalar and '!'/'#' operations are
> +-- implemented separately, and both are tested.
> +ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
> +t:update(ops)
> +ops[#ops][1] = '!'
> +t:update(ops)
> +ops[#ops][1] = '#'
> +ops[#ops][3] = 1
> +t:update(ops)
> +-- Key extractor should reject any attempt to use non-string keys.
> +ops[#ops] = {'-', '[2].h.i[20]', -1}
> +t:update(ops)
> +ops[#ops][1] = '='
> +t:update(ops)
> +
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
> +t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
> +t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
> +-- '!'/'=' can be used to create a field, but only if it is a
> +-- tail. These operations can't create the whole path.
> +t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
> +
>  s:drop()
> -- 
> 2.21.0 (Apple Git-122.2)
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable
  2019-12-27 13:00     ` Vladislav Shpilevoy
@ 2019-12-27 14:59       ` Sergey Ostanevich
  0 siblings, 0 replies; 13+ messages in thread
From: Sergey Ostanevich @ 2019-12-27 14:59 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi!

> So I added a flag to xrow_update_op.

Thanks, this is something I had in mind.
LGTM.

Sergos.

On 27 Dec 16:00, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
> 
> On 26/12/2019 13:07, Sergey Ostanevich wrote:
> > Hi!
> > 
> > Thanks for the patch! 
> > Just one comment below, otherwise LGTM.
> > 
> > Sergos
> > 
> > On 23 Dec 23:41, Vladislav Shpilevoy wrote:
> >> There is a case: [1][2][3][4] = 100. It is not a problem when it
> >> is a single operation, not intersecting with anything. It is an
> >> isolated update then, and works ok. But the next patch allows
> >> several update operations to have the same prefix, and the path
> >> [1][2][3][4] can become a tree of updated arrays. For example, a
> >> trivial tree like this:
> >>
> >>     root: [ [1] ]
> >>              |
> >>              [ [1] [2] ]
> >>                     |
> >>                     [ [1] [2] [3] ]
> >>                                |
> >>                                [ [1] [2] [3] [4] ]
> >>                                              =100
> >>
> >> When the update is applied to root, the JSON path [1][2][3][4]
> >> is decoded one part by one. And the operation goes down the tree
> >> until reaches the leaf, where [4] = 100 is applied. Each time when
> >> the update goes one level down, somebody should update
> >> xrow_update_op.field_no so that on the first level it would be 1,
> >> then 2, 3, 4.
> >>
> >> Does it mean that each level of the update [1][2][3][4] should
> >> prepare field_no for the next child? No, because then they would
> >> need to check type of the child if it is an array or map, or
> >> whatever expects a valid field_no/key in xrow_update_op, and
> >> ensure that map-child gets a key, array-child gets a field_no.
> >> That would complicate the code to a totally unreadable
> >> state, and would break encapsulation between
> >> xrow_update_array/map/bar... . Each array update operation would
> >> check a child for all existing types to ensure that the next token
> >> matches it. The same would happen to map updates.
> >>
> >> This patch goes another way - let each level of the update check
> >> whether its field_no/key is already prepared by the caller, and if
> >> not, extract the next token from the operation path. So a map update
> >> will ensure that it has a valid key, and an array update will ensure
> >> that it has a valid field number.
> >>
> >> Part of #1261
> >> ---
> >>  src/box/xrow_update_array.c | 47 +++++++++++++++++++++++++++++++++++--
> >>  src/box/xrow_update_field.c | 22 +++++++++++++++++
> >>  src/box/xrow_update_field.h | 25 ++++++++++++++++++--
> >>  3 files changed, 90 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
> >> index 57427e39c..6a7ce09ff 100644
> >> --- a/src/box/xrow_update_array.c
> >> +++ b/src/box/xrow_update_array.c
> >> @@ -32,6 +32,31 @@
> >>  #include "msgpuck.h"
> >>  #include "fiber.h"
> >>  
> >> +/**
> >> + * Make sure @a op contains a valid field number to where the
> >> + * operation should be applied next. The field number may not be
> >> + * known if the array's parent didn't propagate the operation's
> >> + * lexer. In fact, the parent fills fieldno only in some rare
> >> + * cases like branching. Generally, an array should take care of
> >> + * fieldno by itself.
> >> + */
> >> +static inline int
> >> +xrow_update_op_prepare_num_token(struct xrow_update_op *op)
> >> +{
> >> +	/*
> >> +	 * Token type END is a special value meaning that the
> >> +	 * current token needs to be parsed.
> >> +	 */
> > 
> > I really don't like the reuse of entities introduced for another
> > purpose. If one reads this code, it will be hard to understand why,
> > out of nowhere, the lexer is forced to come to an end at
> > xrow_update_op_do_array_insert(), for example. More than that - if
> > the lexer is in the 'reached the end' state, it is forced to parse.
> > 
> > If we introduce more functionality for JSON later, I expect we will
> > face even more logic hanging on this 'special value'. Then the
> > contexts of these functionalities may intersect... oh my...
> > 
> > Can we introduce a new type with some self-explanatory name instead and
> > leave the JSON_TOKEN_END on its own? 
> 
> Agree. But I don't know how to justify the introduction of a new
> token type to the json library. I was thinking about something like
> JSON_TOKEN_INVALID, but 1) the json library reports errors via return
> values, not via token types, 2) an 'invalid' token implies an error,
> not an already parsed token, which is what I need in my patch.
> 
> A value like JSON_TOKEN_CONSUMED/USED looks too specific for my case
> to be added to the core lib.
> 
> So I added a flag to xrow_update_op. See diff below, and the whole
> patch in the end.
> 
> ================================================================================
> 
> diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
> index 2eb98e8e9..e6b9ed89f 100644
> --- a/src/box/xrow_update_array.c
> +++ b/src/box/xrow_update_array.c
> @@ -43,13 +43,8 @@
>  static inline int
>  xrow_update_op_prepare_num_token(struct xrow_update_op *op)
>  {
> -	/*
> -	 * Token type END is a special value meaning that the
> -	 * current token needs to be parsed.
> -	 */
> -	if (op->token_type == JSON_TOKEN_END &&
> -	    xrow_update_op_consume_token(op) != 0)
> -			return -1;
> +	if (op->is_token_consumed && xrow_update_op_next_token(op) != 0)
> +		return -1;
>  	if (op->token_type != JSON_TOKEN_NUM) {
>  		return xrow_update_err(op, "can't update an array by a "\
>  				       "non-numeric index");
> @@ -64,7 +59,7 @@ xrow_update_op_prepare_num_token(struct xrow_update_op *op)
>  static inline int
>  xrow_update_op_adjust_field_no(struct xrow_update_op *op, int32_t field_count)
>  {
> -	assert(op->token_type == JSON_TOKEN_NUM);
> +	assert(op->token_type == JSON_TOKEN_NUM && !op->is_token_consumed);
>  	if (op->field_no >= 0) {
>  		if (op->field_no < field_count)
>  			return 0;
> @@ -254,7 +249,7 @@ xrow_update_op_do_array_insert(struct xrow_update_op *op,
>  		item = xrow_update_array_extract_item(field, op);
>  		if (item == NULL)
>  			return -1;
> -		op->token_type = JSON_TOKEN_END;
> +		op->is_token_consumed = true;
>  		return xrow_update_op_do_field_insert(op, &item->field);
>  	}
>  
> @@ -290,7 +285,7 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
>  	if (item == NULL)
>  		return -1;
>  	if (!xrow_update_op_is_term(op)) {
> -		op->token_type = JSON_TOKEN_END;
> +		op->is_token_consumed = true;
>  		return xrow_update_op_do_field_set(op, &item->field);
>  	}
>  	op->new_field_len = op->arg.set.length;
> @@ -318,7 +313,7 @@ xrow_update_op_do_array_delete(struct xrow_update_op *op,
>  			xrow_update_array_extract_item(field, op);
>  		if (item == NULL)
>  			return -1;
> -		op->token_type = JSON_TOKEN_END;
> +		op->is_token_consumed = true;
>  		return xrow_update_op_do_field_delete(op, &item->field);
>  	}
>  
> @@ -347,7 +342,7 @@ xrow_update_op_do_array_##op_type(struct xrow_update_op *op,			\
>  	if (item == NULL)							\
>  		return -1;							\
>  	if (!xrow_update_op_is_term(op)) {					\
> -		op->token_type = JSON_TOKEN_END;				\
> +		op->is_token_consumed = true;					\
>  		return xrow_update_op_do_field_##op_type(op, &item->field);	\
>  	}									\
>  	if (item->field.type != XUPDATE_NOP)					\
> diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
> index 96fcaf747..460672b4c 100644
> --- a/src/box/xrow_update_field.c
> +++ b/src/box/xrow_update_field.c
> @@ -612,7 +612,7 @@ xrow_update_op_by(char opcode)
>  }
>  
>  int
> -xrow_update_op_consume_token(struct xrow_update_op *op)
> +xrow_update_op_next_token(struct xrow_update_op *op)
>  {
>  	struct json_token token;
>  	int rc = json_lexer_next_token(&op->lexer, &token);
> @@ -620,6 +620,7 @@ xrow_update_op_consume_token(struct xrow_update_op *op)
>  		return xrow_update_err_bad_json(op, rc);
>  	if (token.type == JSON_TOKEN_END)
>  		return xrow_update_err_no_such_field(op);
> +	op->is_token_consumed = false;
>  	op->token_type = token.type;
>  	op->key = token.str;
>  	op->key_len = token.len;
> @@ -661,6 +662,7 @@ xrow_update_op_decode(struct xrow_update_op *op, int index_base,
>  	 * dictionary.
>  	 */
>  	op->token_type = JSON_TOKEN_NUM;
> +	op->is_token_consumed = false;
>  	int32_t field_no = 0;
>  	switch(mp_typeof(**expr)) {
>  	case MP_INT:
> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
> index fbaf45c5d..0b0f608fe 100644
> --- a/src/box/xrow_update_field.h
> +++ b/src/box/xrow_update_field.h
> @@ -179,11 +179,16 @@ struct xrow_update_op {
>  	const struct xrow_update_op_meta *meta;
>  	/** Operation arguments. */
>  	union xrow_update_arg arg;
> +	/** Current level token. */
> +	enum json_token_type token_type;
>  	/**
> -	 * Current level token. END means that it is invalid and
> -	 * a next token should be extracted from the lexer.
> +	 * The flag says whether the token is already consumed by
> +	 * the update operation during its forwarding down the
> +	 * update tree. When the flag is true, it means that the
> +	 * next node of the update tree will need to fetch a next
> +	 * token from the lexer.
>  	 */
> -	enum json_token_type token_type;
> +	bool is_token_consumed;
>  	union {
>  		struct {
>  			const char *key;
> @@ -212,7 +217,7 @@ struct xrow_update_op {
>   * map/array.
>   */
>  int
> -xrow_update_op_consume_token(struct xrow_update_op *op);
> +xrow_update_op_next_token(struct xrow_update_op *op);
>  
>  /**
>   * Decode an update operation from MessagePack.
> 
> ================================================================================
> 
> The whole patch:
> 
> ================================================================================
> 
>     tuple: make update operation tokens consumable
>     
>     There is a case: [1][2][3][4] = 100. It is not a problem when it
>     is a single operation, not intersecting with anything. It is an
>     is a single operation, not intersecting with anything. It is an
>     isolated update then, and works ok. But the next patch allows
>     several update operations to have the same prefix, and the path
>     trivial tree like this:
>     
>         root: [ [1] ]
>                  |
>                  [ [1] [2] ]
>                         |
>                         [ [1] [2] [3] ]
>                                    |
>                                    [ [1] [2] [3] [4] ]
>                                                  =100
>     
>     When the update is applied to the root, the JSON path
>     [1][2][3][4] is decoded one part at a time. And the operation
>     goes down the tree until it reaches the leaf, where [4] = 100 is
>     applied. Each time the update goes one level down, somebody
>     should update xrow_update_op.field_no so that on the first level
>     it is 1, then 2, 3, 4.
>     
>     Does it mean that each level of the update [1][2][3][4] should
>     prepare field_no for the next child? No, because then it would
>     need to check the type of the child - whether it is an array or
>     a map, or whatever else expects a valid field_no/key in
>     xrow_update_op - and ensure that a map child gets a key and an
>     array child gets a field_no. That would complicate the code to a
>     totally unreadable state, and would break the encapsulation
>     between xrow_update_array/map/bar... . Each array update
>     operation would have to check its child against all existing
>     types to ensure that the next token matches. The same would
>     happen to map updates.
>     
>     This patch goes another way - let each level of the update check
>     whether its field_no/key is already prepared by the caller, and
>     if not, extract the next token from the operation path. So a map
>     update will ensure that it has a valid key, and an array update
>     will ensure that it has a valid field number.
>     
>     Part of #1261
> 
> diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
> index 57427e39c..e6b9ed89f 100644
> --- a/src/box/xrow_update_array.c
> +++ b/src/box/xrow_update_array.c
> @@ -32,6 +32,26 @@
>  #include "msgpuck.h"
>  #include "fiber.h"
>  
> +/**
> + * Make sure @a op contains a valid field number to where the
> + * operation should be applied next. Field number may be not
> + * known, if the array's parent didn't propagate operation's
> + * lexer. In fact, the parent fills fieldno only in some rare
> + * cases like branching. Generally, an array should care about
> + * fieldno by itself.
> + */
> +static inline int
> +xrow_update_op_prepare_num_token(struct xrow_update_op *op)
> +{
> +	if (op->is_token_consumed && xrow_update_op_next_token(op) != 0)
> +		return -1;
> +	if (op->token_type != JSON_TOKEN_NUM) {
> +		return xrow_update_err(op, "can't update an array by a "\
> +				       "non-numeric index");
> +	}
> +	return 0;
> +}
> +
>  /**
>   * Make field index non-negative and check for the field
>   * existence.
> @@ -39,6 +59,7 @@
>  static inline int
>  xrow_update_op_adjust_field_no(struct xrow_update_op *op, int32_t field_count)
>  {
> +	assert(op->token_type == JSON_TOKEN_NUM && !op->is_token_consumed);
>  	if (op->field_no >= 0) {
>  		if (op->field_no < field_count)
>  			return 0;
> @@ -221,10 +242,14 @@ xrow_update_op_do_array_insert(struct xrow_update_op *op,
>  {
>  	assert(field->type == XUPDATE_ARRAY);
>  	struct xrow_update_array_item *item;
> +	if (xrow_update_op_prepare_num_token(op) != 0)
> +		return -1;
> +
>  	if (!xrow_update_op_is_term(op)) {
>  		item = xrow_update_array_extract_item(field, op);
>  		if (item == NULL)
>  			return -1;
> +		op->is_token_consumed = true;
>  		return xrow_update_op_do_field_insert(op, &item->field);
>  	}
>  
> @@ -248,6 +273,9 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
>  {
>  	assert(field->type == XUPDATE_ARRAY);
>  	struct xrow_update_rope *rope = field->array.rope;
> +	if (xrow_update_op_prepare_num_token(op) != 0)
> +		return -1;
> +
>  	/* Interpret '=' for n + 1 field as insert. */
>  	if (op->field_no == (int32_t) xrow_update_rope_size(rope))
>  		return xrow_update_op_do_array_insert(op, field);
> @@ -256,8 +284,10 @@ xrow_update_op_do_array_set(struct xrow_update_op *op,
>  		xrow_update_array_extract_item(field, op);
>  	if (item == NULL)
>  		return -1;
> -	if (!xrow_update_op_is_term(op))
> +	if (!xrow_update_op_is_term(op)) {
> +		op->is_token_consumed = true;
>  		return xrow_update_op_do_field_set(op, &item->field);
> +	}
>  	op->new_field_len = op->arg.set.length;
>  	/*
>  	 * Ignore the previous op, if any. It is not correct,
> @@ -275,11 +305,15 @@ xrow_update_op_do_array_delete(struct xrow_update_op *op,
>  			       struct xrow_update_field *field)
>  {
>  	assert(field->type == XUPDATE_ARRAY);
> +	if (xrow_update_op_prepare_num_token(op) != 0)
> +		return -1;
> +
>  	if (!xrow_update_op_is_term(op)) {
>  		struct xrow_update_array_item *item =
>  			xrow_update_array_extract_item(field, op);
>  		if (item == NULL)
>  			return -1;
> +		op->is_token_consumed = true;
>  		return xrow_update_op_do_field_delete(op, &item->field);
>  	}
>  
> @@ -301,12 +335,16 @@ int										\
>  xrow_update_op_do_array_##op_type(struct xrow_update_op *op,			\
>  				  struct xrow_update_field *field)		\
>  {										\
> +	if (xrow_update_op_prepare_num_token(op) != 0)				\
> +		return -1;							\
>  	struct xrow_update_array_item *item =					\
>  		xrow_update_array_extract_item(field, op);			\
>  	if (item == NULL)							\
>  		return -1;							\
> -	if (!xrow_update_op_is_term(op))					\
> +	if (!xrow_update_op_is_term(op)) {					\
> +		op->is_token_consumed = true;					\
>  		return xrow_update_op_do_field_##op_type(op, &item->field);	\
> +	}									\
>  	if (item->field.type != XUPDATE_NOP)					\
>  		return xrow_update_err_double(op);				\
>  	if (xrow_update_op_do_##op_type(op, item->field.data) != 0)		\
> diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
> index de865a21d..460672b4c 100644
> --- a/src/box/xrow_update_field.c
> +++ b/src/box/xrow_update_field.c
> @@ -611,6 +611,23 @@ xrow_update_op_by(char opcode)
>  	}
>  }
>  
> +int
> +xrow_update_op_next_token(struct xrow_update_op *op)
> +{
> +	struct json_token token;
> +	int rc = json_lexer_next_token(&op->lexer, &token);
> +	if (rc != 0)
> +		return xrow_update_err_bad_json(op, rc);
> +	if (token.type == JSON_TOKEN_END)
> +		return xrow_update_err_no_such_field(op);
> +	op->is_token_consumed = false;
> +	op->token_type = token.type;
> +	op->key = token.str;
> +	op->key_len = token.len;
> +	op->field_no = token.num;
> +	return 0;
> +}
> +
>  int
>  xrow_update_op_decode(struct xrow_update_op *op, int index_base,
>  		      struct tuple_dictionary *dict, const char **expr)
> @@ -639,6 +656,13 @@ xrow_update_op_decode(struct xrow_update_op *op, int index_base,
>  		diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
>  		return -1;
>  	}
> +	/*
> +	 * First token is always num. Even if a user specified a
> +	 * field name it is converted to num by the tuple
> +	 * dictionary.
> +	 */
> +	op->token_type = JSON_TOKEN_NUM;
> +	op->is_token_consumed = false;
>  	int32_t field_no = 0;
>  	switch(mp_typeof(**expr)) {
>  	case MP_INT:
> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
> index bda9222cc..0b0f608fe 100644
> --- a/src/box/xrow_update_field.h
> +++ b/src/box/xrow_update_field.h
> @@ -179,8 +179,23 @@ struct xrow_update_op {
>  	const struct xrow_update_op_meta *meta;
>  	/** Operation arguments. */
>  	union xrow_update_arg arg;
> -	/** First level field no. */
> -	int32_t field_no;
> +	/** Current level token. */
> +	enum json_token_type token_type;
> +	/**
> +	 * The flag says whether the token is already consumed by
> +	 * the update operation during its forwarding down the
> +	 * update tree. When the flag is true, it means that the
> +	 * next node of the update tree will need to fetch a next
> +	 * token from the lexer.
> +	 */
> +	bool is_token_consumed;
> +	union {
> +		struct {
> +			const char *key;
> +			uint32_t key_len;
> +		};
> +		int32_t field_no;
> +	};
>  	/** Size of a new field after it is updated. */
>  	uint32_t new_field_len;
>  	/** Opcode symbol: = + - / ... */
> @@ -193,6 +208,17 @@ struct xrow_update_op {
>  	struct json_lexer lexer;
>  };
>  
> +/**
> + * Extract a next token from the operation path lexer. The result
> + * is used to decide to which child of a current map/array the
> + * operation should be forwarded. It is not just a synonym to
> + * json_lexer_next_token, because fills some fields of @a op,
> + * and should be used only to chose a next child inside a current
> + * map/array.
> + */
> +int
> +xrow_update_op_next_token(struct xrow_update_op *op);
> +
>  /**
>   * Decode an update operation from MessagePack.
>   * @param[out] op Update operation.
> 
> ================================================================================
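The consume/refill mechanics of the patch can also be sketched in isolation. The code below is a simplified stand-in, not the real Tarantool code: an array of pre-parsed tokens plays the role of the JSON lexer, and the struct and function names (`mock_op`, `mock_op_prepare_num_token`, ...) only mirror the real ones.

```c
#include <stdbool.h>

/* Hypothetical token kinds mirroring JSON_TOKEN_NUM/JSON_TOKEN_STR. */
enum tok_type { TOK_NUM, TOK_STR };

/*
 * Simplified stand-in for struct xrow_update_op: an array of
 * pre-parsed tokens plays the role of the JSON lexer.
 */
struct mock_op {
	enum tok_type types[8];		/* the "lexer": remaining path tokens */
	int nums[8];
	int pos;			/* lexer position */
	int count;
	enum tok_type token_type;	/* current level token */
	int field_no;
	bool is_token_consumed;		/* set when forwarded one level down */
};

/* Analog of xrow_update_op_next_token(). */
static int
mock_op_next_token(struct mock_op *op)
{
	if (op->pos >= op->count)
		return -1;		/* no such field */
	op->token_type = op->types[op->pos];
	op->field_no = op->nums[op->pos];
	op->pos++;
	op->is_token_consumed = false;
	return 0;
}

/*
 * Analog of xrow_update_op_prepare_num_token(): refill the token
 * only when a parent has marked it consumed, then require it to
 * be numeric.
 */
static int
mock_op_prepare_num_token(struct mock_op *op)
{
	if (op->is_token_consumed && mock_op_next_token(op) != 0)
		return -1;
	if (op->token_type != TOK_NUM)
		return -1;	/* can't update an array by a non-numeric index */
	return 0;
}
```

Each tree level that forwards the operation down sets `is_token_consumed = true`; the next level refills the token lazily, exactly as the arrays do in the patch.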


* Re: [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps
  2019-12-27 14:57   ` Sergey Ostanevich
@ 2019-12-27 15:52     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-27 15:52 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Thanks for the review!

On 27/12/2019 15:57, Sergey Ostanevich wrote:
> Hi!
> 
> Thanks for the patch!
> Just the spaces after unaries - otherwise LGTM.

Fixed:

================================================================================
diff --git a/src/box/xrow_update_map.c b/src/box/xrow_update_map.c
index 65dcb7408..b4251cc3b 100644
--- a/src/box/xrow_update_map.c
+++ b/src/box/xrow_update_map.c
@@ -224,7 +224,7 @@ xrow_update_op_do_map_insert(struct xrow_update_op *op,
 	struct xrow_update_map_item *item;
 	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
 		return -1;
-	if (! xrow_update_op_is_term(op)) {
+	if (!xrow_update_op_is_term(op)) {
 		if (item == NULL)
 			return xrow_update_err_no_such_field(op);
 		op->is_token_consumed = true;
@@ -247,7 +247,7 @@ xrow_update_op_do_map_set(struct xrow_update_op *op,
 	struct xrow_update_map_item *item;
 	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
 		return -1;
-	if (! xrow_update_op_is_term(op)) {
+	if (!xrow_update_op_is_term(op)) {
 		if (item == NULL)
 			return xrow_update_err_no_such_field(op);
 		op->is_token_consumed = true;
@@ -275,7 +275,7 @@ xrow_update_op_do_map_delete(struct xrow_update_op *op,
 	struct xrow_update_map_item *item;
 	if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
 		return -1;
-	if (! xrow_update_op_is_term(op)) {
+	if (!xrow_update_op_is_term(op)) {
 		if (item == NULL)
 			return xrow_update_err_no_such_field(op);
 		op->is_token_consumed = true;
@@ -318,7 +318,7 @@ xrow_update_op_do_map_##op_type(struct xrow_update_op *op,			\
 		xrow_update_map_extract_item(field, op);			\
 	if (item == NULL)							\
 		return -1;							\
-	if (! xrow_update_op_is_term(op)) {					\
+	if (!xrow_update_op_is_term(op)) {					\
 		op->is_token_consumed = true;					\
 		return xrow_update_op_do_field_##op_type(op, &item->field);	\
 	}									\
================================================================================

> 
> If I got it right, the limitation from the #2 patch:
> 
> |    This is not allowed yet:
> |    
> |        [1][2][3].a.b.c = 20
> |        [1][2][3].a.e.f = 30
> |    
> |        First difference is 'b' vs 'e' - intersection by a map,
> |        not ok.
> 
> is removed. Could you mention this in the description?
> 

Sure. New commit message (prior to doc request):

================================================================================
    tuple: JSON path update intersection at maps
    
    Previous commits introduced isolated JSON updates, and then
    allowed intersection at arrays. This one completes the puzzle,
    adding intersection at maps, so now both of these samples work:
    
    Allowed in the previous commit:
    
        [1][2][3].a.b.c = 20
        [1][2][4].e.f.g = 30
               ^
    
        First difference is [3] vs [4] - intersection by an array.
    
    Allowed in this commit:
    
        [1][2][3].a.b.c = 20
        [1][2][3].a.e.f = 30
                    ^
    
        First difference is 'b' vs 'e' - intersection by a map.
    
    Now JSON updates are fully available.
    
    Closes #1261

================================================================================
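The commit message defines intersection by the first differing path component. That prefix scan can be sketched in isolation; the sketch below uses hypothetical helper names and a toy tokenizer that only understands `[N]` and `.key`, not the real json_lexer.

```c
#include <string.h>

/*
 * Minimal tokenizer for paths like "[1][2][3].a.b.c": returns the
 * length of the token starting at p, or 0 at the end of the path.
 */
static int
tok_len(const char *p)
{
	if (*p == '[') {
		const char *q = strchr(p, ']');
		return q == NULL ? 0 : (int)(q - p + 1);
	}
	if (*p == '.') {
		int n = 1;
		while (p[n] != '\0' && p[n] != '.' && p[n] != '[')
			n++;
		return n;
	}
	return 0;
}

/*
 * Analog of the prefix scan in xrow_update_route_branch(): walk
 * both paths token by token and return the byte offset of the
 * first differing token. Offset 0 means "no common prefix" - no
 * route node is created and the root is transformed instead.
 */
static int
common_prefix_len(const char *a, const char *b)
{
	int off = 0;
	for (;;) {
		int la = tok_len(a + off);
		int lb = tok_len(b + off);
		if (la == 0 || lb == 0 || la != lb ||
		    memcmp(a + off, b + off, la) != 0)
			return off;
		off += la;
	}
}
```

For the two samples from the commit message, the scan stops at `[3]` vs `[4]` (array intersection) and at `.b` vs `.e` (map intersection) respectively.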


* Re: [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays
  2019-12-27 14:13   ` Sergey Ostanevich
@ 2019-12-27 15:52     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2019-12-27 15:52 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: tarantool-patches

Thanks for the review!

>> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
>> index fbaf45c5d..f07b5a649 100644
>> --- a/src/box/xrow_update_field.h
>> +++ b/src/box/xrow_update_field.h
>> @@ -447,6 +475,33 @@ xrow_update_array_create(struct xrow_update_field *field, const char *header,
>>  			 const char *data, const char *data_end,
>>  			 uint32_t field_count);
>>  
>> +/**
>> + * The same as mere create, but the array is created with a first
>> + * child right away. It allows to make it more efficient, because
>> + * under the hood a rope is created not as a one huge range which
>> + * is then spit in parts, but as two rope nodes from the
>               ^^^^
> Should be     split here?
> 

Yes, it is a typo.

================================================================================
  * under the hood a rope is created not as a one huge range which
- * is then spit in parts, but as two rope nodes from the
+ * is then split in parts, but as two rope nodes from the
  * beginning. On the summary: -1 rope node split, -1 decoding of
================================================================================

>> + * beginning. On the summary: -1 rope node split, -1 decoding of
>> + * fields from 0 to @a field_no.
>> + *
>> + * The function is used during branching, where there was an
>> + * existing update, but another one came with the same prefix, and
>> + * a different suffix.
>> + *
>> + * @param[out] field Field to initialize.
>> + * @param header Header of the MessagePack array.
>> + * @param child A child subtree. The child is copied by value into
>> + *        the created array.
>> + * @param field_no Field number of @a child.
>> + *
>> + * @retval  0 Success.
>> + * @retval -1 Error.
>> + */
>> +int
>> +xrow_update_array_create_with_child(struct xrow_update_field *field,
>> +				    const char *header,
>> +				    const struct xrow_update_field *child,
>> +				    int32_t field_no);
>> +
>> diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
>> new file mode 100644
>> index 000000000..f374db837
>> --- /dev/null
>> +++ b/src/box/xrow_update_route.c
>> @@ -0,0 +1,346 @@
>> +/**
>> + * Do the actual branch. This is the case when an existing
>> + * bar/route path is different from a new operation's path in an
>> + * array. The existing object needs to be split into parent-child,
>> + * and the new operation becomes a second child.
>> + *
>> + * @param[out] next_hop A field which will be initialized as an
>> + *        array, and which will be a point to apply a new
>> + *        operation.
>> + * @param parent MessagePack array to be taken by @a next_hop.
>> + * @param child Current field from which the branch happens. It
>> + *        already contains an update subtree.
>> + */
>> +static int
>> +xrow_update_route_branch_array(struct xrow_update_field *next_hop,
>> +			       const char *parent,
>> +			       const struct xrow_update_field *child,
>> +			       int32_t field_no)
>> +{
>> +	struct xrow_update_op *op = child->bar.op;
>> +	/*
>> +	 * There are rules when a subtree can be just copied as is
> 
> I would name these as limitations, not rules. 
> 

Ok, fixed:

================================================================================
 	struct xrow_update_op *op = child->bar.op;
 	/*
-	 * There are rules when a subtree can be just copied as is
-	 * from one parent to another.
+	 * There are limitations when a subtree can be just copied
+	 * as is from one parent to another.
 	 * 1) It should not be a bar update. Because if it is not
================================================================================

>> +	 * from one parent to another.
>> +	 * 1) It should not be a bar update. Because if it is not
>> +	 *    a bar, then it is either scalar or an array/map.
>> +	 *    Scalar update can be safely moved. Array/map update
>> +	 *    doesn't change their parent, and also can be moved.
>> +	 *    Otherwise see (2).
>> +	 * 2) It is a bar. Then it should not be a leaf. If it is
>> +	 *    not a leaf, then it does not change header and other
>> +	 *    fields of this particular array, and can be safely
>> +	 *    moved to somewhere else.
>> +	 *    Otherwise see (1).
>                             ^^^
>                             (3) I suppose?

Yup.

================================================================================
 	 *    moved to somewhere else.
-	 *    Otherwise see (1).
+	 *    Otherwise see (3).
 	 * 3) Ok, it is a bar, a leaf. Then it is a bar with zero
================================================================================

>> +	 * 3) Ok, it is a bar, a leaf. Then it is a bar with zero
>> +	 *    path length. It could degrade to zero path len in
>> +	 *    during branching. In this case it should be a scalar
>> +	 *    bar. The only non-scalar operations are '!' and '#'.
>> +	 *
>> +	 * Why '#' and '!' can't be moved? '!', for example, being
>> +	 * applied to a field [1], affects all fields [2-*], and
>> +	 * the array header. The same but in an even worse form
>> +	 * about '#'. Such operations should be redone. They
>> +	 * affect many fields and the parent.
>> +	 *
>> +	 * There is a tricky thing though - why not to just redo
>> +	 * all operations here, for the sake of code simplicity?
>> +	 * It would allow to remove 'create_with_child' crutch.
>> +	 * The answer is - it is not possible. If a field is
>> +	 * copyable, it is not re-applicable. And vice-versa. For
>> +	 * example, if it is not a leaf, then there may be many
>> +	 * operations, not one. A subtree just can't be
>> +	 * 're-applied'.
>> +	 *
>> +	 * If the operation is scalar and a leaf, then its result
>> +	 * has already overridden its arguments. This is because
>> +	 * scalar operations save result into the arguments, to
>> +	 * save memory. A second operation appliance would lead
>> +	 * to very surprising results.
>> +	 *
>> +	 * Another reason - performance. This path should be
>> +	 * quite hot, and to copy a struct is for sure much faster
>> +	 * than to reapply an operation using a virtual function.
>> +	 * Operations '!' and '#' are quite rare, so their
>> +	 * optimization is not a critical goal.
>> +	 */
>> +	if (/* (1) Not a bar. */
>> +	    child->type != XUPDATE_BAR ||
>> +	    /* (2) Bar, but not a leaf. */
>> +	    child->bar.path_len > 0 ||
>> +	    /* (3) Leaf, bar, but a scalar operation. */
>> +	    (op->opcode != '!' && op->opcode != '#')) {
>> +		return xrow_update_array_create_with_child(next_hop, parent,
>> +							   child, field_no);
>> +	}
>> +	/*
>> +	 * Can't move the child. The only way to branch is to
>> +	 * reapply the child's operation.
>> +	 */
>> +	op->token_type = JSON_TOKEN_NUM;
>> +	op->field_no = field_no;
>> +	const char *data = parent;
>> +	uint32_t field_count = mp_decode_array(&data);
>> +	const char *end = data;
>> +	for (uint32_t i = 0; i < field_count; ++i)
>> +		mp_next(&end);
>> +	if (xrow_update_array_create(next_hop, parent, data, end,
>> +				     field_count) != 0)
>> +		return -1;
>> +	return op->meta->do_op(op, next_hop);
>> +}
>> +
>> +struct xrow_update_field *
>> +xrow_update_route_branch(struct xrow_update_field *field,
>> +			 struct xrow_update_op *new_op)
>> +{
>> +	assert(new_op->lexer.src != NULL);
>> +	const char *old_path;
>> +	int old_path_len;
>> +	if (field->type == XUPDATE_BAR) {
>> +		old_path = field->bar.path;
>> +		old_path_len = field->bar.path_len;
>> +	} else {
>> +		assert(field->type == XUPDATE_ROUTE);
>> +		old_path = field->route.path;
>> +		old_path_len = field->route.path_len;
>> +	}
>> +	assert(old_path != NULL);
>> +	struct json_lexer old_path_lexer;
>> +	struct json_token old_token, new_token;
>> +	/*
>> +	 * Offset is going to be used as a length of the route
>> +	 * node created as a parent of the old subtree and the
>> +	 * new operation. As it is described in the route
>> +	 * definition in struct xrow_update_field - route is the
>> +	 * common prefix of all operations of the subtree. Here
>> +	 * length of that prefix is calculated.
>> +	 * In addition, it is used here to determine if the new
>> +	 * operation is different from the current subtree from
>> +	 * the very beginning. Basically, it means the offset is
>> +	 * 0, and no route is generated. Root becomes a regular
>> +	 * update field (array, map), not a route. Example:
>> +	 * @a field is a bar '[1].a.b = 20', @a new_op is
>> +	 * '[2].c.d = 30'. In this case the paths are already
>> +	 * different from the beginning, no common prefix = no
>> +	 * route. Array with children [1].a.b and [2].c.d becomes
>> +	 * a root.
>> +	 */
>> +	int saved_old_offset;
>> +	json_lexer_create(&old_path_lexer, old_path, old_path_len,
>> +			  TUPLE_INDEX_BASE);
>> +	const char *parent = field->data;
>> +	do {
>> +		saved_old_offset = old_path_lexer.offset;
>> +		int rc = json_lexer_next_token(&old_path_lexer, &old_token);
>> +		/* Old path is already validated. */
>> +		assert(rc == 0);
>> +		rc = json_lexer_next_token(&new_op->lexer, &new_token);
>> +		if (rc != 0) {
>> +			xrow_update_err_bad_json(new_op, rc);
>> +			return NULL;
>> +		}
>> +		if (json_token_cmp(&old_token, &new_token) != 0)
>> +			break;
>> +		switch(new_token.type) {
>> +		case JSON_TOKEN_NUM:
>> +			rc = tuple_field_go_to_index(&parent, new_token.num);
>> +			break;
>> +		case JSON_TOKEN_STR:
>> +			rc = tuple_field_go_to_key(&parent, new_token.str,
>> +						   new_token.len);
>> +			break;
>> +		default:
>> +			/*
>> +			 * Can't be JSON_TOKEN_ANY, because old
>> +			 * and new tokens are equal, but '*' is
>> +			 * considered invalid and the old was
>> +			 * already checked for that. So the new is
>> +			 * valid too. And can't have type ANY.
>> +			 */
>> +			assert(new_token.type == JSON_TOKEN_END);
>> +			xrow_update_err_double(new_op);
>> +			return NULL;
>> +		}
>> +		/*
>> +		 * Must always find a field, because the old
>> +		 * token already went that path.
>> +		 */
>> +		assert(rc == 0);
>> +	} while (true);
>> +	enum mp_type type = mp_typeof(*parent);
>> +	/*
>> +	 * Ched if the paths are different from the very
>            ^^^^
>            check(ed)?

Sure, typo.

================================================================================
 	/*
-	 * Ched if the paths are different from the very
+	 * Check if the paths are different from the very
 	 * beginning. It means, that the old field should be
================================================================================

>> +	 * beginning. It means, that the old field should be
>> +	 * transformed instead of creating a new route node.
>> +	 */
>> +	bool transform_root = saved_old_offset == 0;
> 
> It would be nicer if you put comparison in () up there
> 

Ok, I don't mind.

================================================================================
 	 */
-	bool transform_root = saved_old_offset == 0;
+	bool transform_root = (saved_old_offset == 0);
 	struct xrow_update_field *next_hop;
================================================================================

>> +	struct xrow_update_field *next_hop;
>> +	if (! transform_root) {
> 
> No spaces after unaries, use sed -ie "s/(! /\(\!/"
> 

True.

================================================================================
diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
index d03028c7b..7e07e695b 100644
--- a/src/box/xrow_update_route.c
+++ b/src/box/xrow_update_route.c
@@ -210,7 +210,7 @@ xrow_update_route_branch(struct xrow_update_field *field,
 	 */
 	bool transform_root = (saved_old_offset == 0);
 	struct xrow_update_field *next_hop;
-	if (! transform_root) {
+	if (!transform_root) {
 		next_hop = (struct xrow_update_field *)
 			region_alloc(&fiber()->gc, sizeof(*next_hop));
 		if (next_hop == NULL) {
@@ -266,7 +266,7 @@ xrow_update_route_branch(struct xrow_update_field *field,
 		return NULL;
 	}
 
-	if (! transform_root) {
+	if (!transform_root) {
 		field->type = XUPDATE_ROUTE;
 		field->route.path = old_path;
 		field->route.path_len = saved_old_offset;
@@ -286,7 +286,7 @@ static struct xrow_update_field *
 xrow_update_route_next(struct xrow_update_field *field, struct xrow_update_op *op)
 {
 	assert(field->type == XUPDATE_ROUTE);
-	assert(! xrow_update_op_is_term(op));
+	assert(!xrow_update_op_is_term(op));
 	const char *new_path = op->lexer.src + op->lexer.offset;
 	int new_path_len = op->lexer.src_len - op->lexer.offset;
 	if (field->route.path_len <= new_path_len &&
================================================================================
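The copy-vs-reapply rules from the big comment in xrow_update_route_branch_array() reduce to a small predicate. A sketch with mocked-up types follows - the names are hypothetical, not the real structures:

```c
#include <stdbool.h>

enum mock_field_type { XUP_SCALAR, XUP_BAR, XUP_ARRAY };

/* Simplified stand-in for struct xrow_update_field. */
struct mock_field {
	enum mock_field_type type;
	int bar_path_len;	/* remaining path length for a bar */
	char opcode;		/* '=', '+', '!', '#', ... */
};

/*
 * Mirror of the three conditions in the patch: a subtree can be
 * moved as-is unless it is a leaf bar whose operation ('!' or '#')
 * affects sibling fields and the array header, in which case it
 * must be re-applied.
 */
static bool
child_is_movable(const struct mock_field *child)
{
	if (child->type != XUP_BAR)
		return true;		/* (1) not a bar */
	if (child->bar_path_len > 0)
		return true;		/* (2) bar, but not a leaf */
	/* (3) leaf bar: movable only for scalar opcodes. */
	return child->opcode != '!' && child->opcode != '#';
}
```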


* Re: [Tarantool-patches] [PATCH 0/3] JSON update
  2019-12-23 22:41 [Tarantool-patches] [PATCH 0/3] JSON update Vladislav Shpilevoy
                   ` (2 preceding siblings ...)
  2019-12-23 22:41 ` [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps Vladislav Shpilevoy
@ 2019-12-30  5:26 ` Kirill Yukhin
  3 siblings, 0 replies; 13+ messages in thread
From: Kirill Yukhin @ 2019-12-30  5:26 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hello,

On 23 дек 23:41, Vladislav Shpilevoy wrote:
> The patchset finishes JSON update feature by allowance of making
> multiple update operations with the same prefix in one update().
> 
> The patchset consists of relatively independent self-explaining
> parts. First part is a preparation for having not just a set of
> updates, but for having an update tree, with number and string
> keys.
> 
> Second part makes it possible to do multiple update operations
> having the same prefix if it ends on an array index.
> 
> The final part introduces map updates, documentation request, and
> finishes the feature.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-1261-json-update
> Issue: https://github.com/tarantool/tarantool/issues/1261

I've checked the patch set into master.

--
Regards, Kirill Yukhin


end of thread, other threads:[~2019-12-30  5:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-23 22:41 [Tarantool-patches] [PATCH 0/3] JSON update Vladislav Shpilevoy
2019-12-23 22:41 ` [Tarantool-patches] [PATCH 1/3] tuple: make update operation tokens consumable Vladislav Shpilevoy
2019-12-24 16:15   ` Vladislav Shpilevoy
2019-12-26 12:07   ` Sergey Ostanevich
2019-12-27 13:00     ` Vladislav Shpilevoy
2019-12-27 14:59       ` Sergey Ostanevich
2019-12-23 22:41 ` [Tarantool-patches] [PATCH 2/3] tuple: JSON path update intersection at arrays Vladislav Shpilevoy
2019-12-27 14:13   ` Sergey Ostanevich
2019-12-27 15:52     ` Vladislav Shpilevoy
2019-12-23 22:41 ` [Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps Vladislav Shpilevoy
2019-12-27 14:57   ` Sergey Ostanevich
2019-12-27 15:52     ` Vladislav Shpilevoy
2019-12-30  5:26 ` [Tarantool-patches] [PATCH 0/3] JSON update Kirill Yukhin
