Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@freelists.org
Cc: kostja@tarantool.org
Subject: [tarantool-patches] [PATCH v2 8/8] tuple: JSON updates support intersection by maps
Date: Sat, 31 Aug 2019 23:35:58 +0200	[thread overview]
Message-ID: <067da13b80758c16a6eb905968dac4fa1e952c83.1567287197.git.v.shpilevoy@tarantool.org> (raw)
In-Reply-To: <cover.1567287197.git.v.shpilevoy@tarantool.org>

Previous commits introduced JSON updates with any non-intersecting
JSON paths, and allowed intersection by arrays. This one completes
the puzzle, now any JSON updates are available.

Closes #1261

@TarantoolBot document
Title: JSON updates
Tuple/space/index:update/upsert now support JSON paths. All the
same paths as allowed in tuple["..."]. Including field names in
the first JSON path token.

Example:
box.cfg{}
format = {}
format[1] = {'field1', 'unsigned'}
format[2] = {'field2', 'map'}
format[3] = {'field3', 'array'}
format[4] = {'field4', 'string', is_nullable = true}
s = box.schema.create_space('test', {format = format})
_ = s:create_index('pk')
t = {
    1,
    {
        key1 = 'value',
        key2 = 10
    },
    {
        2,
        3,
        {key3 = 20}
    }
}
t = s:replace(t)

tarantool> t:update({{'=', 'field2.key1', 'new_value'}})
---
- [1, {'key1': 'new_value', 'key2': 10}, [2, 3, {'key3': 20}]]
...

tarantool> t:update({{'+', 'field3[2]', 1}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 4, {'key3': 20}]]
...

tarantool> s:update({1}, {{'!', 'field4', 'inserted value'}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 3, {'key3': 20}], 'inserted value']
...

tarantool> s:update({1}, {{'#', '[2].key2', 1}, {'=', '[3][3].key4', 'value4'}})
---
- [1, {'key1': 'value'}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...

tarantool> s:upsert({1, {k = 'v'}, {}}, {{'#', '[2].key1', 1}})
---
...

tarantool> s:select{}
---
- - [1, {}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...

Note, that there is the same rule, as in tuple field access by
JSON, for field names looking like JSON paths. The rule is that
firstly the whole path is interpreted as a field name. If such a
name does not exist, then it is treated as a path. For example,
if there is a field name 'field.name.like.json', then this update

    <obj>:update({..., 'field.name.like.json', ...})

will update this field, instead of keys 'field' -> 'name' ->
'like' -> 'json'. If such a name is needed as a part of a bigger
path, then it should be wrapped in quotes and []:

    <obj>:update({..., '["field.name.like.json"].next.fields', ...})

There are some new rules for JSON updates:

- Operation '!' can't be used to create all intermediate nodes of
  a path. For example, {'!', 'field1[1].field3', ...} can't
  create fields 'field1' and '[1]', they should exist.

- Operation '#', when applied to maps, can't delete more than one
  key at once. That is, its argument should be always 1 for maps.

    {'#', 'field1.field2', 1} - this is allowed;
    {'#', 'field1.field2', 10} - this is not.

  That limitation originates from a problem, that keys in a map
  are not ordered anyhow, and '#' with more than 1 key would lead
  to undefined behaviour.

- Operation '!' on maps can't create a key, if it exists already.

- If a map contains non-string keys (booleans, numbers, maps,
  arrays - anything), then these keys can't be updated via JSON
  paths. But it is still allowed to update string keys in such a
  map.

Why JSON updates are good, and should be preferred when only a
part of a tuple needs to be updated?

- They consume less space in WAL, because for an update only its
  keys, operations, and arguments are stored. It is cheaper to
  store update of one deep field, than the whole tuple subtree.

- They are faster. Firstly, this is because they are implemented
  in C, and have no problems with Lua GC and dynamic typing.
  Secondly, some cases of JSON paths are highly optimized. For
  example, an update with a single JSON path costs O(1) memory
  regardless of how deep that path goes (not counting update
  arguments).

- They are available from remote clients, as well as any other
  DML. Before JSON updates to update one deep part of a tuple it
  would be necessary to download that tuple, update it in memory,
  send it back - 2 network hops. With JSON paths it can be 1 when
  the update can be described in paths.
---
 src/box/CMakeLists.txt        |   1 +
 src/box/update/update_field.c |   4 +
 src/box/update/update_field.h |  84 ++++++-
 src/box/update/update_map.c   | 441 ++++++++++++++++++++++++++++++++++
 src/box/update/update_route.c |  74 ++++--
 test/box/update.result        | 152 ++++++++++++
 test/box/update.test.lua      |  75 ++++++
 7 files changed, 813 insertions(+), 18 deletions(-)
 create mode 100644 src/box/update/update_map.c

diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 67461bd9b..a68231189 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -45,6 +45,7 @@ add_library(tuple STATIC
     update/update_array.c
     update/update_bar.c
     update/update_route.c
+    update/update_map.c
     tuple_compare.cc
     tuple_extract_key.cc
     tuple_hash.cc
diff --git a/src/box/update/update_field.c b/src/box/update/update_field.c
index 161dd5875..fdc1a2512 100644
--- a/src/box/update/update_field.c
+++ b/src/box/update/update_field.c
@@ -118,6 +118,8 @@ update_field_sizeof(struct update_field *field)
 		return update_bar_sizeof(field);
 	case UPDATE_ROUTE:
 		return update_route_sizeof(field);
+	case UPDATE_MAP:
+		return update_map_sizeof(field);
 	default:
 		unreachable();
 	}
@@ -146,6 +148,8 @@ update_field_store(struct update_field *field, char *out, char *out_end)
 		return update_bar_store(field, out, out_end);
 	case UPDATE_ROUTE:
 		return update_route_store(field, out, out_end);
+	case UPDATE_MAP:
+		return update_map_store(field, out, out_end);
 	default:
 		unreachable();
 	}
diff --git a/src/box/update/update_field.h b/src/box/update/update_field.h
index 81c010615..c3a9b6d97 100644
--- a/src/box/update/update_field.h
+++ b/src/box/update/update_field.h
@@ -33,6 +33,7 @@
 #include "trivia/util.h"
 #include "tt_static.h"
 #include <stdint.h>
+#include "salad/stailq.h"
 #include "json/json.h"
 #include "bit/int96.h"
 #include "mp_decimal.h"
@@ -245,7 +246,8 @@ enum update_type {
 	 * for example, when a rope is split in two parts: not
 	 * changed left range of fields, and a right range with
 	 * its first field changed. In this case the left range is
-	 * NOP.
+	 * NOP. And when a map is updated and split into rages,
+	 * when only certain points are not NOP.
 	 */
 	UPDATE_NOP,
 	/**
@@ -282,6 +284,11 @@ enum update_type {
 	 * with a subtree 'key1 = 20', 'key2 = 30', 'key3 = true'.
 	 */
 	UPDATE_ROUTE,
+	/**
+	 * Field is an updated map. Check item list for updates of
+	 * individual fields.
+	 */
+	UPDATE_MAP,
 };
 
 /**
@@ -370,6 +377,48 @@ struct update_field {
 			/** Update subtree. */
 			struct update_field *next_hop;
 		} route;
+		/**
+		 * The field is an updated map. Individual fields
+		 * are stored very similar to array update and its
+		 * rope nodes. Each item is a key, a value, and a
+		 * tail of unchanged key-value pairs. The items
+		 * are stored in a list. But the list is not
+		 * sorted anyhow by keys, because it does not
+		 * really make any sense:
+		 *
+		 * 1) Keys in MessagePack are not sorted anyway,
+		 *    and any kind of search would not be possible
+		 *    even if they were sorted. Sort of a map
+		 *    would require Nlog(N) time and N memory even
+		 *    if only a few fields were updated.
+		 *
+		 * 2) Double scalar update of the same key is not
+		 *    possible.
+		 *
+		 * Although there is something which can be and is
+		 * optimized. When a key is updated first time,
+		 * it is moved to the beginning of the list, and
+		 * after all operations are done, it is stored
+		 * in the result tuple before unchanged fields. On
+		 * a second update of the same tuple it is found
+		 * immediately.
+		 */
+		struct {
+			/**
+			 * List of map update items. Each item is
+			 * a key, a value, and an unchanged tail.
+			 */
+			struct stailq items;
+			/**
+			 * Number of key-value pairs in the list.
+			 * Note, it is not a number of items, but
+			 * exactly number of key-value pairs. It
+			 * is used to store MessagePack map header
+			 * without decoding each item again just
+			 * to learn the size.
+			 */
+			int size;
+		} map;
 	};
 };
 
@@ -464,6 +513,37 @@ OP_DECL_GENERIC(array)
 
 /* }}} update_field.array */
 
+/* {{{ update_field.map */
+
+/**
+ * Initialize @a field as a map to update.
+ * @param[out] field Field to initialize.
+ * @param header Header of the MessagePack map @a data.
+ * @param data MessagePack data of the map to update.
+ * @param data_end End of @a data.
+ * @param field_count Key-value pair count in @data.
+ *
+ * @retval  0 Success.
+ * @retval -1 Error.
+ */
+int
+update_map_create(struct update_field *field, const char *header,
+		  const char *data, const char *data_end, int field_count);
+
+/**
+ * Create a map update with an existing child. Motivation is
+ * exactly the same as with a similar array constructor. It allows
+ * to avoid unnecessary MessagePack decoding.
+ */
+int
+update_map_create_with_child(struct update_field *field,
+			     const struct update_field *child, const char *key,
+			     uint32_t key_len, const char *header);
+
+OP_DECL_GENERIC(map)
+
+/* }}} update_field.map */
+
 /* {{{ update_field.bar */
 
 OP_DECL_GENERIC(bar)
@@ -540,6 +620,8 @@ do_op_##op_type(struct update_op *op, struct update_field *field)		\
 		return do_op_bar_##op_type(op, field);				\
 	case UPDATE_ROUTE:							\
 		return do_op_route_##op_type(op, field);			\
+	case UPDATE_MAP:							\
+		return do_op_map_##op_type(op, field);				\
 	default:								\
 		unreachable();							\
 	}									\
diff --git a/src/box/update/update_map.c b/src/box/update/update_map.c
new file mode 100644
index 000000000..14838404e
--- /dev/null
+++ b/src/box/update/update_map.c
@@ -0,0 +1,441 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "update_field.h"
+#include "fiber.h"
+#include "small/region.h"
+#include "box/tuple_format.h"
+
+/**
+ * Descriptor of one updated key-value pair. Besides updated data
+ * it contains a tail with unchanged pairs, just so as not to
+ * create a separate object for them, and to be similar with array
+ * update items.
+ */
+struct update_map_item {
+	/**
+	 * Updated key. Can be NULL. In such a case this item
+	 * contains only an unchanged tail. A key can be nullified
+	 * if it was removed from the map, or when a map is just
+	 * created and has no any update yet.
+	 */
+	const char *key;
+	/** Length of @a key. */
+	uint32_t key_len;
+	/** Updated value. */
+	struct update_field field;
+	/** Link in the list of updated map keys. */
+	struct stailq_entry in_items;
+	/**
+	 * Size in bytes of unchanged tail data. It goes right
+	 * after @a field.data.
+	 */
+	uint32_t tail_size;
+};
+
+static inline struct update_map_item *
+update_map_item_alloc(void)
+{
+	struct update_map_item *item = (struct update_map_item *)
+		region_alloc(&fiber()->gc, sizeof(*item));
+	if (item == NULL)
+		diag_set(OutOfMemory, sizeof(*item), "region_alloc", "item");
+	return item;
+}
+
+static void
+update_map_create_item(struct update_field *field, struct update_map_item *item,
+		       enum update_type type, const char *key, uint32_t key_len,
+		       const char *data, uint32_t data_size, uint32_t tail_size)
+{
+	assert(field->type == UPDATE_MAP);
+	item->key = key;
+	item->key_len = key_len;
+	item->field.type = type;
+	item->field.data = data;
+	item->field.size = data_size,
+	item->tail_size = tail_size;
+	/*
+	 * Each time a new item it created it is stored in the
+	 * head of update map item list. It helps in case the
+	 * tuple is regularly updated, because on all next updates
+	 * this key will be found from the very beginning of the
+	 * map.
+	 */
+	stailq_add_entry(&field->map.items, item, in_items);
+}
+
+static inline struct update_map_item *
+update_map_new_item(struct update_field *field, enum update_type type,
+		    const char *key, uint32_t key_len, const char *data,
+		    uint32_t data_size, uint32_t tail_size)
+{
+	struct update_map_item *item = update_map_item_alloc();
+	if (item != NULL) {
+		update_map_create_item(field, item, type, key, key_len, data,
+				       data_size, tail_size);
+	}
+	return item;
+}
+
+/**
+ * Find an update item to which @a op should be applied. The
+ * target field may do not exist, but at least its parent should.
+ */
+static int
+update_map_extract_opt_item(struct update_field *field, struct update_op *op,
+			    struct update_map_item **res)
+{
+	assert(field->type == UPDATE_MAP);
+	if (op->token_type == JSON_TOKEN_END) {
+		if (update_op_consume_token(op) != 0)
+			return -1;
+		if (op->token_type != JSON_TOKEN_STR) {
+			return update_err(op, "can't update a map by not a "\
+					  "string key");
+		}
+	}
+	struct stailq *items = &field->map.items;
+	struct update_map_item *i, *new_item;
+	/*
+	 * Firstly, try to find the key among already updated
+	 * ones. Perhaps, the needed node is just an intermediate
+	 * key of a bigger JSON path, and there are many updates
+	 * passing this key, so it should be here for all except
+	 * first updates.
+	 */
+	stailq_foreach_entry(i, items, in_items) {
+		if (i->key != NULL && i->key_len == op->key_len &&
+		    memcmp(i->key, op->key, i->key_len) == 0) {
+			*res = i;
+			return 0;
+		}
+	}
+	/*
+	 * Slow path - the key is updated first time, need to
+	 * decode tails.
+	 */
+	uint32_t key_len, i_tail_size;
+	const char *pos, *key, *end, *tmp, *begin;
+	stailq_foreach_entry(i, items, in_items) {
+		begin = i->field.data + i->field.size;
+		pos = begin;
+		end = begin + i->tail_size;
+		i_tail_size = 0;
+		while(pos < end) {
+			if (mp_typeof(*pos) != MP_STR) {
+				mp_next(&pos);
+				mp_next(&pos);
+				continue;
+			}
+			i_tail_size = pos - begin;
+			key = mp_decode_str(&pos, &key_len);
+			if (key_len == op->key_len &&
+			    memcmp(key, op->key, key_len) == 0)
+				goto key_is_found;
+			mp_next(&pos);
+		}
+	}
+	*res = NULL;
+	return 0;
+
+key_is_found:
+	tmp = pos;
+	mp_next(&tmp);
+	if (i_tail_size == 0 && i->key == NULL) {
+		/*
+		 * Looks like the needed key was found from the
+		 * beginning of tail of an item without a key.
+		 * This is actually good, because this item can
+		 * be transformed instead of a new item creation.
+		 */
+		i->key = op->key;
+		i->key_len = op->key_len;
+		i->field.data = pos;
+		i->field.size = tmp - pos;
+		i->tail_size = end - tmp;
+		new_item = i;
+	} else {
+		new_item = update_map_new_item(field, UPDATE_NOP, op->key,
+					       op->key_len, pos, tmp - pos,
+					       end - tmp);
+		if (new_item == NULL)
+			return -1;
+		i->tail_size = i_tail_size;
+	}
+	*res = new_item;
+	return 0;
+}
+
+/**
+ * The same as optional extractor, but the field to update should
+ * exist. It is the case of all scalar operations (except '=' - it
+ * can work as insert).
+ */
+static inline struct update_map_item *
+update_map_extract_item(struct update_field *field, struct update_op *op)
+{
+	assert(field->type == UPDATE_MAP);
+	struct update_map_item *res;
+	if (update_map_extract_opt_item(field, op, &res) != 0)
+		return NULL;
+	if (res == NULL)
+		update_err_no_such_field(op);
+	return res;
+}
+
+int
+do_op_map_insert(struct update_op *op, struct update_field *field)
+{
+	assert(field->type == UPDATE_MAP);
+	struct update_map_item *item;
+	if (update_map_extract_opt_item(field, op, &item) != 0)
+		return -1;
+	if (! update_op_is_term(op)) {
+		if (item == NULL)
+			return update_err_no_such_field(op);
+		op->token_type = JSON_TOKEN_END;
+		return do_op_insert(op, &item->field);
+	}
+	if (item != NULL)
+		return update_err_duplicate(op);
+	++field->map.size;
+	item = update_map_new_item(field, UPDATE_NOP, op->key, op->key_len,
+				   op->arg.set.value, op->arg.set.length, 0);
+	return item != NULL ? 0 : -1;
+}
+
+int
+do_op_map_set(struct update_op *op, struct update_field *field)
+{
+	assert(field->type == UPDATE_MAP);
+	struct update_map_item *item;
+	if (update_map_extract_opt_item(field, op, &item) != 0)
+		return -1;
+	if (! update_op_is_term(op)) {
+		if (item == NULL)
+			return update_err_no_such_field(op);
+		op->token_type = JSON_TOKEN_END;
+		return do_op_set(op, &item->field);
+	}
+	if (item != NULL) {
+		op->new_field_len = op->arg.set.length;
+		/* Ignore the previous op, if any. */
+		item->field.type = UPDATE_SCALAR;
+		item->field.scalar.op = op;
+		return 0;
+	}
+	++field->map.size;
+	item = update_map_new_item(field, UPDATE_NOP, op->key, op->key_len,
+				   op->arg.set.value, op->arg.set.length, 0);
+	return item != NULL ? 0 : -1;
+}
+
+int
+do_op_map_delete(struct update_op *op, struct update_field *field)
+{
+	assert(field->type == UPDATE_MAP);
+	struct update_map_item *item;
+	if (update_map_extract_opt_item(field, op, &item) != 0)
+		return -1;
+	if (! update_op_is_term(op)) {
+		if (item == NULL)
+			return update_err_no_such_field(op);
+		op->token_type = JSON_TOKEN_END;
+		return do_op_delete(op, &item->field);
+	}
+	if (op->arg.del.count != 1)
+		return update_err_delete1(op);
+	if (item == NULL)
+		return 0;
+	/*
+	 * Note, even if tail size becomes 0, this item is not
+	 * deleted. This is because items are linked via stailq,
+	 * elements of which can't be deleted as simple as that.
+	 * But it is not a big deal, because '#' is a really rare
+	 * operation.
+	 * Why just a next key from the tail can't be decoded?
+	 * Why key should be NULL here? This is because the JSON
+	 * updates allow to update a map containing non-string
+	 * keys. If the next key is not a string, it can't be
+	 * saved as key of the item. Therefore, it is better not
+	 * to touch unchanged tails unless a new update operation
+	 * needs it.
+	 */
+	item->key = NULL;
+	item->key_len = 0;
+	item->field.data += item->field.size;
+	item->field.size = 0;
+	item->field.type = UPDATE_NOP;
+	--field->map.size;
+	return 0;
+}
+
+#define DO_SCALAR_OP_GENERIC(op_type)						\
+int										\
+do_op_map_##op_type(struct update_op *op, struct update_field *field)		\
+{										\
+	assert(field->type == UPDATE_MAP);					\
+	struct update_map_item *item =						\
+		update_map_extract_item(field, op);				\
+	if (item == NULL)							\
+		return -1;							\
+	if (! update_op_is_term(op)) {						\
+		op->token_type = JSON_TOKEN_END;				\
+		return do_op_##op_type(op, &item->field);			\
+	}									\
+	if (item->field.type != UPDATE_NOP)					\
+		return update_err_double(op);					\
+	if (update_op_do_##op_type(op, item->field.data) != 0)			\
+		return -1;							\
+	item->field.type = UPDATE_SCALAR;					\
+	item->field.scalar.op = op;						\
+	return 0;								\
+}
+
+DO_SCALAR_OP_GENERIC(arith)
+
+DO_SCALAR_OP_GENERIC(bit)
+
+DO_SCALAR_OP_GENERIC(splice)
+
+int
+update_map_create(struct update_field *field, const char *header,
+		  const char *data, const char *data_end, int field_count)
+{
+	field->type = UPDATE_MAP;
+	field->data = header;
+	field->size = data_end - header;
+	field->map.size = field_count;
+	stailq_create(&field->map.items);
+	if (field_count == 0)
+		return 0;
+	struct update_map_item *first =
+		update_map_new_item(field, UPDATE_NOP, NULL, 0, data, 0,
+				    data_end - data);
+	return first != NULL ? 0 : -1;
+}
+
+int
+update_map_create_with_child(struct update_field *field,
+			     const struct update_field *child, const char *key,
+			     uint32_t key_len, const char *header)
+{
+	field->type = UPDATE_MAP;
+	field->data = header;
+	stailq_create(&field->map.items);
+
+	const char *pos = header;
+	uint32_t i, ikey_len, field_count = mp_decode_map(&pos);
+	const char *begin = pos;
+	struct update_map_item *item = NULL;
+	for (i = 0; i < field_count; ++i) {
+		if (mp_typeof(*pos) != MP_STR) {
+			mp_next(&pos);
+			mp_next(&pos);
+			continue;
+		}
+		const char *before_key = pos;
+		const char *ikey = mp_decode_str(&pos, &ikey_len);
+		if (ikey_len == key_len && memcmp(ikey, key, key_len) == 0) {
+			item = update_map_new_item(field, UPDATE_NOP, NULL, 0,
+						   begin, 0,
+						   before_key - begin);
+			if (item == NULL)
+				return -1;
+			++i;
+			break;
+		}
+		mp_next(&pos);
+	}
+	/*
+	 * When a map is initialized with an existing child, it
+	 * means, that it was already found earlier. I.e. it is
+	 * impossible to miss it.
+	 */
+	assert(item != NULL);
+	const char *data = pos;
+	mp_next(&pos);
+	uint32_t data_size = pos - data;
+	for (; i < field_count; ++i) {
+		mp_next(&pos);
+		mp_next(&pos);
+	}
+	item = update_map_item_alloc();
+	if (item == NULL)
+		return -1;
+	item->field = *child;
+	update_map_create_item(field, item, child->type, key, key_len, data,
+			       data_size, pos - data - data_size);
+	field->map.size = field_count;
+	field->size = pos - header;
+	return 0;
+}
+
+uint32_t
+update_map_sizeof(struct update_field *field)
+{
+	assert(field->type == UPDATE_MAP);
+	uint32_t res = mp_sizeof_map(field->map.size);
+	struct update_map_item *i;
+	stailq_foreach_entry(i, &field->map.items, in_items) {
+		res += i->tail_size;
+		if (i->key != NULL) {
+			res += mp_sizeof_str(i->key_len) +
+			       update_field_sizeof(&i->field);
+		}
+	}
+	return res;
+}
+
+uint32_t
+update_map_store(struct update_field *field, char *out, char *out_end)
+{
+	assert(field->type == UPDATE_MAP);
+	char *out_begin = out;
+	out = mp_encode_map(out, field->map.size);
+	struct update_map_item *i;
+	/*
+	 * This is the trick about saving updated keys before
+	 * others. The first cycle doesn't save unchanged tails.
+	 */
+	stailq_foreach_entry(i, &field->map.items, in_items) {
+		if (i->key != NULL) {
+			out = mp_encode_str(out, i->key, i->key_len);
+			out += update_field_store(&i->field, out, out_end);
+		}
+	}
+	stailq_foreach_entry(i, &field->map.items, in_items) {
+		memcpy(out, i->field.data + i->field.size, i->tail_size);
+		out += i->tail_size;
+	}
+	assert(out <= out_end);
+	return out - out_begin;
+}
diff --git a/src/box/update/update_route.c b/src/box/update/update_route.c
index f1d2ced65..41fcf8b3b 100644
--- a/src/box/update/update_route.c
+++ b/src/box/update/update_route.c
@@ -109,6 +109,36 @@ update_route_branch_array(struct update_field *next_hop,
 	return op->meta->do_cb(op, next_hop);
 }
 
+/**
+ * Do the actual branch, but by a map and a key in that map. Works
+ * exactly the same as the array-counterpart.
+ */
+static int
+update_route_branch_map(struct update_field *next_hop,
+			const struct update_field *child, const char *key,
+			int key_len, const char *parent)
+{
+	struct update_op *op = child->bar.op;
+	if (child->type != UPDATE_BAR || child->bar.path_len > 0 ||
+	    (op->opcode != '!' && op->opcode != '#')) {
+		return update_map_create_with_child(next_hop, child, key,
+						    key_len, parent);
+	}
+	op->token_type = JSON_TOKEN_STR;
+	op->key = key;
+	op->key_len = key_len;
+	const char *data = parent;
+	uint32_t field_count = mp_decode_map(&data);
+	const char *end = data;
+	for (uint32_t i = 0; i < field_count; ++i) {
+		mp_next(&end);
+		mp_next(&end);
+	}
+	if (update_map_create(next_hop, parent, data, end, field_count) != 0)
+		return -1;
+	return op->meta->do_cb(op, next_hop);
+}
+
 struct update_field *
 update_route_branch(struct update_field *field, struct update_op *new_op)
 {
@@ -205,19 +235,6 @@ update_route_branch(struct update_field *field, struct update_op *new_op)
 	} else {
 		next_hop = field;
 	}
-	if (type == MP_MAP) {
-		diag_set(ClientError, ER_UNSUPPORTED, "update",
-			 "path intersection on map");
-		return NULL;
-	}
-	if (new_token.type != JSON_TOKEN_NUM) {
-		update_err(new_op, "can not update array by non-integer index");
-		return NULL;
-	}
-	if (type != MP_ARRAY) {
-		update_err_no_such_field(new_op);
-		return NULL;
-	}
 
 	int path_offset = old_path_lexer.offset;
 	struct update_field child = *field;
@@ -245,11 +262,34 @@ update_route_branch(struct update_field *field, struct update_op *new_op)
 		 */
 	}
 
-	new_op->token_type = JSON_TOKEN_NUM;
-	new_op->field_no = new_token.num;
-	if (update_route_branch_array(next_hop, &child, old_token.num,
-				      parent) != 0)
+	if (type == MP_ARRAY) {
+		if (new_token.type != JSON_TOKEN_NUM) {
+			update_err(new_op, "can not update array by "\
+				   "non-integer index");
+			return NULL;
+		}
+		new_op->token_type = JSON_TOKEN_NUM;
+		new_op->field_no = new_token.num;
+		if (update_route_branch_array(next_hop, &child, old_token.num,
+					      parent) != 0)
+			return NULL;
+	} else if (type == MP_MAP) {
+		if (new_token.type != JSON_TOKEN_STR) {
+			update_err(new_op, "can not update map by "\
+				   "non-string key");
+			return NULL;
+		}
+		new_op->token_type = JSON_TOKEN_STR;
+		new_op->key = new_token.str;
+		new_op->key_len = new_token.len;
+		if (update_route_branch_map(next_hop, &child, old_token.str,
+					    old_token.len, parent) != 0)
+			return NULL;
+	} else {
+		update_err_no_such_field(new_op);
 		return NULL;
+	}
+
 	if (! transform_root) {
 		field->type = UPDATE_ROUTE;
 		field->route.path = old_path;
diff --git a/test/box/update.result b/test/box/update.result
index 2c5978d8a..4cbee48e6 100644
--- a/test/box/update.result
+++ b/test/box/update.result
@@ -1471,6 +1471,158 @@ t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
 ---
 - error: Field ''[4][3][2]'' was not found in the tuple
 ...
+--
+-- Intersecting map updates.
+--
+t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
+---
+- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [17, 18,
+        19]], 20, {'b': 22, 'a': 21, 'c': {'d': 2300, 'e': 2400}}, 25]]
+...
+-- Fill an empty map.
+t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
+---
+- [1, {'k': 100, 'v': 200}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14],
+      15, 16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
+...
+t = {}                                  \
+t[1] = 1                                \
+t[2] = {                                \
+    a = 1,                              \
+    b = 2,                              \
+    c = {3, 4, 5},                      \
+    d = {                               \
+        [0] = 0,                        \
+        e = 6,                          \
+        [true] = false,                 \
+        f = 7,                          \
+        [-1] = -1,                      \
+    },                                  \
+    g = {                               \
+        {k = 8, v = 9},                 \
+        [box.NULL] = box.NULL,          \
+        {k = 10, v = 11},               \
+        {k = '12str', v = '13str'},     \
+    },                                  \
+    h = {                               \
+        [-1] = {{{{-1}, -1}, -1}, -1},  \
+        i = {                           \
+            j = 14,                     \
+            k = 15,                     \
+        },                              \
+        m = 16,                         \
+    }                                   \
+}
+---
+...
+t[3] = {}
+---
+...
+t = box.tuple.new(t)
+---
+...
+-- Insert and delete from the same map at once.
+t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
+---
+- [1, {'a': 1, 'n': 17, 'd': {0: 0, -1: -1, true: false, 'f': 7, 'e': 6}, 'h': {'i': {
+        'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3, 4, 5], 'g': {
+      1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+      null: null}}, []]
+...
+-- Delete a not existing key.
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
+---
+- [1, {'b': 2, 'a': 1, 'd': {0: 0, 'f': 7, 'e': 6, -1: -1, true: false, 'g': 6000},
+    'h': {'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3,
+      4, 5], 'g': {1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+      null: null}}, []]
+...
+-- Scalar operations.
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
+---
+- [1, {'b': 3, 'c': [3, 4, 5], 'd': {0: 0, true: false, 'f': 8, -1: -1, 'e': 7}, 'h': {
+      'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'a': 1, 'g': {
+      1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
+      null: null}}, []]
+...
+-- Errors.
+-- This operation array creates an update tree with several full
+-- featured maps, not bars. It is required to cover some complex
+-- cases about fields extraction.
+ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
+---
+...
+-- When a map update extracts a next token on demand, it should
+-- reject bad json, obviously. Scalar and '!'/'#' operations are
+-- implemented separately and tested both.
+ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+ops[#ops][1] = '!'
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+ops[#ops][1] = '#'
+---
+...
+ops[#ops][3] = 1
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
+...
+-- Key extractor should reject any attempt to use non-string keys.
+ops[#ops] = {'-', '[2].h.i[20]', -1}
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
+    key'
+...
+ops[#ops][1] = '='
+---
+...
+t:update(ops)
+---
+- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
+    key'
+...
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
+---
+- error: Field ''[2].d.k'' was not found in the tuple
+...
+t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
+---
+- error: Field ''[2].d.g.h'' was not found in the tuple
+...
+t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
+---
+- error: 'Field ''[2].d.e'' UPDATE error: the key exists already'
+...
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
+---
+- error: 'Field ''[2].d.old'' UPDATE error: can delete only 1 field from a map in
+    a row'
+...
+-- '!'/'=' can be used to create a field, but only if it is a
+-- tail. These operations can't create the whole path.
+t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
+---
+- error: Field ''[2].d.new.new'' was not found in the tuple
+...
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
+---
+- error: Field ''[2].d.old.old'' was not found in the tuple
+...
 s:drop()
 ---
 ...
diff --git a/test/box/update.test.lua b/test/box/update.test.lua
index d91a2e95c..6fed818f4 100644
--- a/test/box/update.test.lua
+++ b/test/box/update.test.lua
@@ -515,4 +515,79 @@ t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]',
 -- during branching.
 t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
 
+--
+-- Intersecting map updates.
+--
+t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
+-- Fill an empty map.
+t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
+
+t = {}                                  \
+t[1] = 1                                \
+t[2] = {                                \
+    a = 1,                              \
+    b = 2,                              \
+    c = {3, 4, 5},                      \
+    d = {                               \
+        [0] = 0,                        \
+        e = 6,                          \
+        [true] = false,                 \
+        f = 7,                          \
+        [-1] = -1,                      \
+    },                                  \
+    g = {                               \
+        {k = 8, v = 9},                 \
+        [box.NULL] = box.NULL,          \
+        {k = 10, v = 11},               \
+        {k = '12str', v = '13str'},     \
+    },                                  \
+    h = {                               \
+        [-1] = {{{{-1}, -1}, -1}, -1},  \
+        i = {                           \
+            j = 14,                     \
+            k = 15,                     \
+        },                              \
+        m = 16,                         \
+    }                                   \
+}
+t[3] = {}
+t = box.tuple.new(t)
+-- Insert and delete from the same map at once.
+t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
+-- Delete a not existing key.
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
+
+-- Scalar operations.
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
+
+-- Errors.
+-- This operation array creates an update tree with several full
+-- featured maps, not bars. It is required to cover some complex
+-- cases about fields extraction.
+ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
+-- When a map update extracts a next token on demand, it should
+-- reject bad json, obviously. Scalar and '!'/'#' operations are
+-- implemented separately and tested both.
+ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
+t:update(ops)
+ops[#ops][1] = '!'
+t:update(ops)
+ops[#ops][1] = '#'
+ops[#ops][3] = 1
+t:update(ops)
+-- Key extractor should reject any attempt to use non-string keys.
+ops[#ops] = {'-', '[2].h.i[20]', -1}
+t:update(ops)
+ops[#ops][1] = '='
+t:update(ops)
+
+t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
+t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
+t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
+-- '!'/'=' can be used to create a field, but only if it is a
+-- tail. These operations can't create the whole path.
+t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
+t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
+
 s:drop()
-- 
2.20.1 (Apple Git-117)

      parent reply	other threads:[~2019-08-31 21:32 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-31 21:35 [tarantool-patches] [PATCH v2 0/8] JSON updates Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 1/8] tuple: expose JSON go_to_key and go_to_index functions Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 2/8] tuple: rework updates to improve code extendibility Vladislav Shpilevoy
     [not found]   ` <20190903192059.GE15611@atlas>
     [not found]     ` <6ee759cf-a975-e6a9-6f52-f855958ffe06@tarantool.org>
     [not found]       ` <20191005132055.GD3913@atlas>
     [not found]         ` <20191005135037.GJ3913@atlas>
2019-10-19 15:11           ` [Tarantool-patches] [tarantool-patches] " Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 3/8] json: lexer_eof and token_cmp helper functions Vladislav Shpilevoy
     [not found]   ` <20190903192433.GF15611@atlas>
     [not found]     ` <f5612e04-dc56-f4bd-1298-c5841ac909f5@tarantool.org>
     [not found]       ` <20191005132231.GE3913@atlas>
     [not found]         ` <20191005135014.GI3913@atlas>
2019-10-19 15:08           ` [Tarantool-patches] [tarantool-patches] " Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 4/8] tuple: account the whole array in field.data and size Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 5/8] tuple: enable JSON bar updates Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 6/8] tuple: make update operation tokens consumable Vladislav Shpilevoy
2019-08-31 21:35 ` [tarantool-patches] [PATCH v2 7/8] tuple: JSON updates support intersection by arrays Vladislav Shpilevoy
2019-08-31 21:35 ` Vladislav Shpilevoy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=067da13b80758c16a6eb905968dac4fa1e952c83.1567287197.git.v.shpilevoy@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [tarantool-patches] [PATCH v2 8/8] tuple: JSON updates support intersection by maps' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox