[Tarantool-patches] [PATCH 3/3] tuple: JSON path update intersection at maps
Sergey Ostanevich
sergos at tarantool.org
Fri Dec 27 17:57:04 MSK 2019
Hi!
Thanks for the patch!
Just the spaces after unaries - otherwise LGTM.
If I got it right, the limitation from the #2 patch:
| This is not allowed yet:
|
| [1][2][3].a.b.c = 20
| [1][2][3].a.e.f = 30
|
| First difference is 'b' vs 'e' - intersection by a map,
| not ok.
is removed. Could you mention this in the description?
Regards,
Sergos
On 23 Dec 23:41, Vladislav Shpilevoy wrote:
> Previous commits introduced isolated JSON updates, and then
> allowed intersection at array. This one completes the puzzle,
> adding intersection at maps. Now JSON updates are fully
> available.
>
> Closes #1261
>
> @TarantoolBot document
> Title: JSON updates
> Tuple/space/index:update/upsert now support JSON paths. All the
> same paths as allowed in tuple["..."].
>
> Example:
> box.cfg{}
> format = {}
> format[1] = {'field1', 'unsigned'}
> format[2] = {'field2', 'map'}
> format[3] = {'field3', 'array'}
> format[4] = {'field4', 'string', is_nullable = true}
> s = box.schema.create_space('test', {format = format})
> _ = s:create_index('pk')
> t = {
> 1,
> {
> key1 = 'value',
> key2 = 10
> },
> {
> 2,
> 3,
> {key3 = 20}
> }
> }
> t = s:replace(t)
>
> tarantool> t:update({{'=', 'field2.key1', 'new_value'}})
> ---
> - [1, {'key1': 'new_value', 'key2': 10}, [2, 3, {'key3': 20}]]
> ...
>
> tarantool> t:update({{'+', 'field3[2]', 1}})
> ---
> - [1, {'key1': 'value', 'key2': 10}, [2, 4, {'key3': 20}]]
> ...
>
> tarantool> s:update({1}, {{'!', 'field4', 'inserted value'}})
> ---
> - [1, {'key1': 'value', 'key2': 10}, [2, 3, {'key3': 20}], 'inserted value']
> ...
>
> tarantool> s:update({1}, {{'#', '[2].key2', 1}, {'=', '[3][3].key4', 'value4'}})
> ---
> - [1, {'key1': 'value'}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
> ...
>
> tarantool> s:upsert({1, {k = 'v'}, {}}, {{'#', '[2].key1', 1}})
> ---
> ...
>
> tarantool> s:select{}
> ---
> - - [1, {}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
> ...
>
> Note, that there is the same rule, as in tuple field access by
> JSON, for field names looking like JSON paths. The rule is that
> firstly the whole path is interpreted as a field name. If such a
> name does not exist, then it is treated as a path. For example,
> if there is a field name 'field.name.like.json', then this update
>
> <obj>:update({..., 'field.name.like.json', ...})
>
> will update this field, instead of keys 'field' -> 'name' ->
> 'like' -> 'json'. If such a name is needed as a part of a bigger
> path, then it should be wrapped in quotes and []:
>
> <obj>:update({..., '["field.name.like.json"].next.fields', ...})
>
> There are some new rules for JSON updates:
>
> - Operation '!' can't be used to create all intermediate nodes of
> a path. For example, {'!', 'field1[1].field3', ...} can't
> create fields 'field1' and '[1]', they should exist.
>
> - Operation '#', when applied to maps, can't delete more than one
> key at once. That is, its argument should be always 1 for maps.
>
> {'#', 'field1.field2', 1} - this is allowed;
> {'#', 'field1.field2', 10} - this is not.
>
> That limitation originates from a problem, that keys in a map
> are not ordered anyhow, and '#' with more than 1 key would lead
> to undefined behaviour.
>
> - Operation '!' on maps can't create a key, if it exists already.
>
> - If a map contains non-string keys (booleans, numbers, maps,
> arrays - anything), then these keys can't be updated via JSON
> paths. But it is still allowed to update string keys in such a
> map.
>
> Why JSON updates are good, and should be preferred when only a
> part of a tuple needs to be updated?
>
> - They consume less space in WAL, because for an update only its
> keys, operations, and arguments are stored. It is cheaper to
> store update of one deep field, than the whole tuple.
>
> - They are faster. Firstly, this is because they are implemented
> in C, and have no problems with Lua GC and dynamic typing.
> Secondly, some cases of JSON paths are highly optimized. For
> example, an update with a single JSON path costs O(1) memory
> regardless of how deep that path goes (not counting update
> arguments).
>
> - They are available from remote clients, as well as any other
> DML. Before JSON updates to update one deep part of a tuple it
> would be necessary to download that tuple, update it in memory,
> send it back - 2 network hops. With JSON paths it can be 1 when
> the update can be described in paths.
> ---
> src/box/CMakeLists.txt | 1 +
> src/box/xrow_update_field.c | 4 +
> src/box/xrow_update_field.h | 85 ++++++-
> src/box/xrow_update_map.c | 453 ++++++++++++++++++++++++++++++++++++
> src/box/xrow_update_route.c | 47 +++-
> test/box/update.result | 152 ++++++++++++
> test/box/update.test.lua | 75 ++++++
> 7 files changed, 813 insertions(+), 4 deletions(-)
> create mode 100644 src/box/xrow_update_map.c
>
> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
> index f99efc503..2022b378b 100644
> --- a/src/box/CMakeLists.txt
> +++ b/src/box/CMakeLists.txt
> @@ -45,6 +45,7 @@ add_library(tuple STATIC
> xrow_update_array.c
> xrow_update_bar.c
> xrow_update_route.c
> + xrow_update_map.c
> tuple_compare.cc
> tuple_extract_key.cc
> tuple_hash.cc
> diff --git a/src/box/xrow_update_field.c b/src/box/xrow_update_field.c
> index 3a4aa6c90..a8db31768 100644
> --- a/src/box/xrow_update_field.c
> +++ b/src/box/xrow_update_field.c
> @@ -116,6 +116,8 @@ xrow_update_field_sizeof(struct xrow_update_field *field)
> return xrow_update_bar_sizeof(field);
> case XUPDATE_ROUTE:
> return xrow_update_route_sizeof(field);
> + case XUPDATE_MAP:
> + return xrow_update_map_sizeof(field);
> default:
> unreachable();
> }
> @@ -145,6 +147,8 @@ xrow_update_field_store(struct xrow_update_field *field, char *out,
> return xrow_update_bar_store(field, out, out_end);
> case XUPDATE_ROUTE:
> return xrow_update_route_store(field, out, out_end);
> + case XUPDATE_MAP:
> + return xrow_update_map_store(field, out, out_end);
> default:
> unreachable();
> }
> diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
> index f07b5a649..6b2a0d999 100644
> --- a/src/box/xrow_update_field.h
> +++ b/src/box/xrow_update_field.h
> @@ -32,6 +32,7 @@
> */
> #include "trivia/util.h"
> #include "tt_static.h"
> +#include "salad/stailq.h"
> #include "json/json.h"
> #include "bit/int96.h"
> #include "mp_decimal.h"
> @@ -254,7 +255,8 @@ enum xrow_update_type {
> * for example, when a rope is split in two parts: not
> * changed left range of fields, and a right range with
> * its first field changed. In this case the left range is
> - * NOP.
> + * NOP. And when a map is updated and split into rages,
> + * when only certain points are not NOP.
> */
> XUPDATE_NOP,
> /**
> @@ -292,6 +294,11 @@ enum xrow_update_type {
> * '[2] = 30', '[3] = true'.
> */
> XUPDATE_ROUTE,
> + /**
> + * Field is an updated map. Check item list for updates of
> + * individual fields.
> + */
> + XUPDATE_MAP,
> };
>
> /**
> @@ -383,6 +390,48 @@ struct xrow_update_field {
> /** Update subtree. */
> struct xrow_update_field *next_hop;
> } route;
> + /**
> + * The field is an updated map. Individual fields
> + * are stored very similar to array update and its
> + * rope nodes. Each item is a key, a value, and a
> + * tail of unchanged key-value pairs. The items
> + * are stored in a list. But the list is not
> + * sorted anyhow by keys, because it does not
> + * really make any sense:
> + *
> + * 1) Keys in MessagePack are not sorted anyway,
> + * and any kind of search would not be possible
> + * even if they were sorted. Sort of a map
> + * would require Nlog(N) time and N memory even
> + * if only a few fields were updated.
> + *
> + * 2) Double scalar update of the same key is not
> + * possible.
> + *
> + * Although there is something which can be and is
> + * optimized. When a key is updated first time,
> + * it is moved to the beginning of the list, and
> + * after all operations are done, it is stored
> + * in the result tuple before unchanged fields. On
> + * a second update of the same tuple it is found
> + * immediately.
> + */
> + struct {
> + /**
> + * List of map update items. Each item is
> + * a key, a value, and an unchanged tail.
> + */
> + struct stailq items;
> + /**
> + * Number of key-value pairs in the list.
> + * Note, it is not a number of items, but
> + * exactly number of key-value pairs. It
> + * is used to store MessagePack map header
> + * without decoding each item again just
> + * to learn the size.
> + */
> + int size;
> + } map;
> };
> };
>
> @@ -506,6 +555,38 @@ OP_DECL_GENERIC(array)
>
> /* }}} xrow_update_field.array */
>
> +/* {{{ xrow_update_field.map */
> +
> +/**
> + * Initialize @a field as a map to update.
> + * @param[out] field Field to initialize.
> + * @param header Header of the MessagePack map @a data.
> + * @param data MessagePack data of the map to update.
> + * @param data_end End of @a data.
> + * @param field_count Key-value pair count in @data.
> + *
> + * @retval 0 Success.
> + * @retval -1 Error.
> + */
> +int
> +xrow_update_map_create(struct xrow_update_field *field, const char *header,
> + const char *data, const char *data_end, int field_count);
> +
> +/**
> + * Create a map update with an existing child. Motivation is
> + * exactly the same as with a similar array constructor. It allows
> + * to avoid unnecessary MessagePack decoding.
> + */
> +int
> +xrow_update_map_create_with_child(struct xrow_update_field *field,
> + const char *header,
> + const struct xrow_update_field *child,
> + const char *key, uint32_t key_len);
> +
> +OP_DECL_GENERIC(map)
> +
> +/* }}} xrow_update_field.map */
> +
> /* {{{ update_field.bar */
>
> OP_DECL_GENERIC(bar)
> @@ -587,6 +668,8 @@ xrow_update_op_do_field_##op_type(struct xrow_update_op *op, \
> return xrow_update_op_do_bar_##op_type(op, field); \
> case XUPDATE_ROUTE: \
> return xrow_update_op_do_route_##op_type(op, field); \
> + case XUPDATE_MAP: \
> + return xrow_update_op_do_map_##op_type(op, field); \
> default: \
> unreachable(); \
> } \
> diff --git a/src/box/xrow_update_map.c b/src/box/xrow_update_map.c
> new file mode 100644
> index 000000000..22fabff96
> --- /dev/null
> +++ b/src/box/xrow_update_map.c
> @@ -0,0 +1,453 @@
> +/*
> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + * copyright notice, this list of conditions and the
> + * following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer in the documentation and/or other materials
> + * provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include "xrow_update_field.h"
> +#include "fiber.h"
> +#include "small/region.h"
> +#include "tuple_format.h"
> +
> +/**
> + * Descriptor of one updated key-value pair. Besides updated data
> + * it contains a tail with unchanged pairs, just so as not to
> + * create a separate object for them, and to be similar with array
> + * update items.
> + */
> +struct xrow_update_map_item {
> + /**
> + * Updated key. Can be NULL. In such a case this item
> + * contains only an unchanged tail. A key can be nullified
> + * if it was removed from the map, or when a map is just
> + * created and has no any update yet.
> + */
> + const char *key;
> + /** Length of @a key. */
> + uint32_t key_len;
> + /** Updated value. */
> + struct xrow_update_field field;
> + /** Link in the list of updated map keys. */
> + struct stailq_entry in_items;
> + /**
> + * Size in bytes of unchanged tail data. It goes right
> + * after @a field.data.
> + */
> + uint32_t tail_size;
> +};
> +
> +static inline struct xrow_update_map_item *
> +xrow_update_map_item_alloc(void)
> +{
> + struct xrow_update_map_item *item = (struct xrow_update_map_item *)
> + region_alloc(&fiber()->gc, sizeof(*item));
> + if (item == NULL)
> + diag_set(OutOfMemory, sizeof(*item), "region_alloc", "item");
> + return item;
> +}
> +
> +static void
> +xrow_update_map_create_item(struct xrow_update_field *field,
> + struct xrow_update_map_item *item,
> + enum xrow_update_type type, const char *key,
> + uint32_t key_len, const char *data,
> + uint32_t data_size, uint32_t tail_size)
> +{
> + assert(field->type == XUPDATE_MAP);
> + item->key = key;
> + item->key_len = key_len;
> + item->field.type = type;
> + item->field.data = data;
> + item->field.size = data_size,
> + item->tail_size = tail_size;
> + /*
> + * Each time a new item it created it is stored in the
> + * head of update map item list. It helps in case the
> + * tuple is regularly updated, because on all next updates
> + * this key will be found from the very beginning of the
> + * map.
> + */
> + stailq_add_entry(&field->map.items, item, in_items);
> +}
> +
> +static inline struct xrow_update_map_item *
> +xrow_update_map_new_item(struct xrow_update_field *field,
> + enum xrow_update_type type, const char *key,
> + uint32_t key_len, const char *data, uint32_t data_size,
> + uint32_t tail_size)
> +{
> + struct xrow_update_map_item *item = xrow_update_map_item_alloc();
> + if (item != NULL) {
> + xrow_update_map_create_item(field, item, type, key, key_len,
> + data, data_size, tail_size);
> + }
> + return item;
> +}
> +
> +/**
> + * Find an update item to which @a op should be applied. The
> + * target field may do not exist, but at least its parent should.
> + */
> +static int
> +xrow_update_map_extract_opt_item(struct xrow_update_field *field,
> + struct xrow_update_op *op,
> + struct xrow_update_map_item **res)
> +{
> + assert(field->type == XUPDATE_MAP);
> + if (op->token_type == JSON_TOKEN_END) {
> + if (xrow_update_op_consume_token(op) != 0)
> + return -1;
> + if (op->token_type != JSON_TOKEN_STR) {
> + return xrow_update_err(op, "can't update a map by not "\
> + "a string key");
> + }
> + }
> + struct stailq *items = &field->map.items;
> + struct xrow_update_map_item *i, *new_item;
> + /*
> + * Firstly, try to find the key among already updated
> + * ones. Perhaps, the needed node is just an intermediate
> + * key of a bigger JSON path, and there are many updates
> + * passing this key, so it should be here for all except
> + * first updates.
> + */
> + stailq_foreach_entry(i, items, in_items) {
> + if (i->key != NULL && i->key_len == op->key_len &&
> + memcmp(i->key, op->key, i->key_len) == 0) {
> + *res = i;
> + return 0;
> + }
> + }
> + /*
> + * Slow path - the key is updated first time, need to
> + * decode tails.
> + */
> + uint32_t key_len, i_tail_size;
> + const char *pos, *key, *end, *tmp, *begin;
> + stailq_foreach_entry(i, items, in_items) {
> + begin = i->field.data + i->field.size;
> + pos = begin;
> + end = begin + i->tail_size;
> + i_tail_size = 0;
> + while(pos < end) {
> + if (mp_typeof(*pos) != MP_STR) {
> + mp_next(&pos);
> + mp_next(&pos);
> + continue;
> + }
> + i_tail_size = pos - begin;
> + key = mp_decode_str(&pos, &key_len);
> + if (key_len == op->key_len &&
> + memcmp(key, op->key, key_len) == 0)
> + goto key_is_found;
> + mp_next(&pos);
> + }
> + }
> + *res = NULL;
> + return 0;
> +
> +key_is_found:
> + tmp = pos;
> + mp_next(&tmp);
> + if (i_tail_size == 0 && i->key == NULL) {
> + /*
> + * Looks like the needed key was found from the
> + * beginning of tail of an item without a key.
> + * This is actually good, because this item can
> + * be transformed instead of a new item creation.
> + */
> + i->key = op->key;
> + i->key_len = op->key_len;
> + i->field.data = pos;
> + i->field.size = tmp - pos;
> + i->tail_size = end - tmp;
> + new_item = i;
> + } else {
> + new_item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
> + op->key_len, pos, tmp - pos,
> + end - tmp);
> + if (new_item == NULL)
> + return -1;
> + i->tail_size = i_tail_size;
> + }
> + *res = new_item;
> + return 0;
> +}
> +
> +/**
> + * The same as optional extractor, but the field to update should
> + * exist. It is the case of all scalar operations (except '=' - it
> + * can work as insert).
> + */
> +static inline struct xrow_update_map_item *
> +xrow_update_map_extract_item(struct xrow_update_field *field,
> + struct xrow_update_op *op)
> +{
> + assert(field->type == XUPDATE_MAP);
> + struct xrow_update_map_item *res;
> + if (xrow_update_map_extract_opt_item(field, op, &res) != 0)
> + return NULL;
> + if (res == NULL)
> + xrow_update_err_no_such_field(op);
> + return res;
> +}
> +
> +int
> +xrow_update_op_do_map_insert(struct xrow_update_op *op,
> + struct xrow_update_field *field)
> +{
> + assert(field->type == XUPDATE_MAP);
> + struct xrow_update_map_item *item;
> + if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
> + return -1;
> + if (! xrow_update_op_is_term(op)) {
> + if (item == NULL)
> + return xrow_update_err_no_such_field(op);
> + op->token_type = JSON_TOKEN_END;
> + return xrow_update_op_do_field_insert(op, &item->field);
> + }
> + if (item != NULL)
> + return xrow_update_err_duplicate(op);
> + ++field->map.size;
> + item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
> + op->key_len, op->arg.set.value,
> + op->arg.set.length, 0);
> + return item != NULL ? 0 : -1;
> +}
> +
> +int
> +xrow_update_op_do_map_set(struct xrow_update_op *op,
> + struct xrow_update_field *field)
> +{
> + assert(field->type == XUPDATE_MAP);
> + struct xrow_update_map_item *item;
> + if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
> + return -1;
> + if (! xrow_update_op_is_term(op)) {
> + if (item == NULL)
> + return xrow_update_err_no_such_field(op);
> + op->token_type = JSON_TOKEN_END;
> + return xrow_update_op_do_field_set(op, &item->field);
> + }
> + if (item != NULL) {
> + op->new_field_len = op->arg.set.length;
> + /* Ignore the previous op, if any. */
> + item->field.type = XUPDATE_SCALAR;
> + item->field.scalar.op = op;
> + return 0;
> + }
> + ++field->map.size;
> + item = xrow_update_map_new_item(field, XUPDATE_NOP, op->key,
> + op->key_len, op->arg.set.value,
> + op->arg.set.length, 0);
> + return item != NULL ? 0 : -1;
> +}
> +
> +int
> +xrow_update_op_do_map_delete(struct xrow_update_op *op,
> + struct xrow_update_field *field)
> +{
> + assert(field->type == XUPDATE_MAP);
> + struct xrow_update_map_item *item;
> + if (xrow_update_map_extract_opt_item(field, op, &item) != 0)
> + return -1;
> + if (! xrow_update_op_is_term(op)) {
> + if (item == NULL)
> + return xrow_update_err_no_such_field(op);
> + op->token_type = JSON_TOKEN_END;
> + return xrow_update_op_do_field_delete(op, &item->field);
> + }
> + if (op->arg.del.count != 1)
> + return xrow_update_err_delete1(op);
> + if (item == NULL)
> + return 0;
> + /*
> + * Note, even if tail size becomes 0, this item is not
> + * deleted. This is because items are linked via stailq,
> + * elements of which can't be deleted as simple as that.
> + * But it is not a big deal, because '#' is a really rare
> + * operation.
> + * Why just a next key from the tail can't be decoded?
> + * Why key should be NULL here? This is because the JSON
> + * updates allow to update a map containing non-string
> + * keys. If the next key is not a string, it can't be
> + * saved as key of the item. Therefore, it is better not
> + * to touch unchanged tails unless a new update operation
> + * needs it.
> + */
> + item->key = NULL;
> + item->key_len = 0;
> + item->field.data += item->field.size;
> + item->field.size = 0;
> + item->field.type = XUPDATE_NOP;
> + --field->map.size;
> + return 0;
> +}
> +
> +#define DO_SCALAR_OP_GENERIC(op_type) \
> +int \
> +xrow_update_op_do_map_##op_type(struct xrow_update_op *op, \
> + struct xrow_update_field *field) \
> +{ \
> + assert(field->type == XUPDATE_MAP); \
> + struct xrow_update_map_item *item = \
> + xrow_update_map_extract_item(field, op); \
> + if (item == NULL) \
> + return -1; \
> + if (! xrow_update_op_is_term(op)) { \
> + op->token_type = JSON_TOKEN_END; \
> + return xrow_update_op_do_field_##op_type(op, &item->field); \
> + } \
> + if (item->field.type != XUPDATE_NOP) \
> + return xrow_update_err_double(op); \
> + if (xrow_update_op_do_##op_type(op, item->field.data) != 0) \
> + return -1; \
> + item->field.type = XUPDATE_SCALAR; \
> + item->field.scalar.op = op; \
> + return 0; \
> +}
> +
> +DO_SCALAR_OP_GENERIC(arith)
> +
> +DO_SCALAR_OP_GENERIC(bit)
> +
> +DO_SCALAR_OP_GENERIC(splice)
> +
> +int
> +xrow_update_map_create(struct xrow_update_field *field, const char *header,
> + const char *data, const char *data_end, int field_count)
> +{
> + field->type = XUPDATE_MAP;
> + field->data = header;
> + field->size = data_end - header;
> + field->map.size = field_count;
> + stailq_create(&field->map.items);
> + if (field_count == 0)
> + return 0;
> + struct xrow_update_map_item *first =
> + xrow_update_map_new_item(field, XUPDATE_NOP, NULL, 0, data, 0,
> + data_end - data);
> + return first != NULL ? 0 : -1;
> +}
> +
> +int
> +xrow_update_map_create_with_child(struct xrow_update_field *field,
> + const char *header,
> + const struct xrow_update_field *child,
> + const char *key, uint32_t key_len)
> +{
> + field->type = XUPDATE_MAP;
> + field->data = header;
> + stailq_create(&field->map.items);
> +
> + const char *pos = header;
> + uint32_t i, ikey_len, field_count = mp_decode_map(&pos);
> + const char *begin = pos;
> + struct xrow_update_map_item *item = NULL;
> + for (i = 0; i < field_count; ++i) {
> + if (mp_typeof(*pos) != MP_STR) {
> + mp_next(&pos);
> + mp_next(&pos);
> + continue;
> + }
> + const char *before_key = pos;
> + const char *ikey = mp_decode_str(&pos, &ikey_len);
> + if (ikey_len == key_len && memcmp(ikey, key, key_len) == 0) {
> + item = xrow_update_map_new_item(field, XUPDATE_NOP,
> + NULL, 0, begin, 0,
> + before_key - begin);
> + if (item == NULL)
> + return -1;
> + ++i;
> + break;
> + }
> + mp_next(&pos);
> + }
> + /*
> + * When a map is initialized with an existing child, it
> + * means, that it was already found earlier. I.e. it is
> + * impossible to miss it.
> + */
> + assert(item != NULL);
> + const char *data = pos;
> + mp_next(&pos);
> + uint32_t data_size = pos - data;
> + for (; i < field_count; ++i) {
> + mp_next(&pos);
> + mp_next(&pos);
> + }
> + item = xrow_update_map_item_alloc();
> + if (item == NULL)
> + return -1;
> + item->field = *child;
> + xrow_update_map_create_item(field, item, child->type, key, key_len,
> + data, data_size, pos - data - data_size);
> + field->map.size = field_count;
> + field->size = pos - header;
> + return 0;
> +}
> +
> +uint32_t
> +xrow_update_map_sizeof(struct xrow_update_field *field)
> +{
> + assert(field->type == XUPDATE_MAP);
> + uint32_t res = mp_sizeof_map(field->map.size);
> + struct xrow_update_map_item *i;
> + stailq_foreach_entry(i, &field->map.items, in_items) {
> + res += i->tail_size;
> + if (i->key != NULL) {
> + res += mp_sizeof_str(i->key_len) +
> + xrow_update_field_sizeof(&i->field);
> + }
> + }
> + return res;
> +}
> +
> +uint32_t
> +xrow_update_map_store(struct xrow_update_field *field, char *out, char *out_end)
> +{
> + assert(field->type == XUPDATE_MAP);
> + char *out_begin = out;
> + out = mp_encode_map(out, field->map.size);
> + struct xrow_update_map_item *i;
> + /*
> + * This is the trick about saving updated keys before
> + * others. The first cycle doesn't save unchanged tails.
> + */
> + stailq_foreach_entry(i, &field->map.items, in_items) {
> + if (i->key != NULL) {
> + out = mp_encode_str(out, i->key, i->key_len);
> + out += xrow_update_field_store(&i->field, out, out_end);
> + }
> + }
> + stailq_foreach_entry(i, &field->map.items, in_items) {
> + memcpy(out, i->field.data + i->field.size, i->tail_size);
> + out += i->tail_size;
> + }
> + assert(out <= out_end);
> + return out - out_begin;
> +}
> diff --git a/src/box/xrow_update_route.c b/src/box/xrow_update_route.c
> index f374db837..b76ebb36b 100644
> --- a/src/box/xrow_update_route.c
> +++ b/src/box/xrow_update_route.c
> @@ -123,6 +123,38 @@ xrow_update_route_branch_array(struct xrow_update_field *next_hop,
> return op->meta->do_op(op, next_hop);
> }
>
> +/**
> + * Do the actual branch, but by a map and a key in that map. Works
> + * exactly the same as the array-counterpart.
> + */
> +static int
> +xrow_update_route_branch_map(struct xrow_update_field *next_hop,
> + const char *parent,
> + const struct xrow_update_field *child,
> + const char *key, int key_len)
> +{
> + struct xrow_update_op *op = child->bar.op;
> + if (child->type != XUPDATE_BAR || child->bar.path_len > 0 ||
> + (op->opcode != '!' && op->opcode != '#')) {
> + return xrow_update_map_create_with_child(next_hop, parent,
> + child, key, key_len);
> + }
> + op->token_type = JSON_TOKEN_STR;
> + op->key = key;
> + op->key_len = key_len;
> + const char *data = parent;
> + uint32_t field_count = mp_decode_map(&data);
> + const char *end = data;
> + for (uint32_t i = 0; i < field_count; ++i) {
> + mp_next(&end);
> + mp_next(&end);
> + }
> + if (xrow_update_map_create(next_hop, parent, data, end,
> + field_count) != 0)
> + return -1;
> + return op->meta->do_op(op, next_hop);
> +}
> +
> struct xrow_update_field *
> xrow_update_route_branch(struct xrow_update_field *field,
> struct xrow_update_op *new_op)
> @@ -256,9 +288,18 @@ xrow_update_route_branch(struct xrow_update_field *field,
> old_token.num) != 0)
> return NULL;
> } else if (type == MP_MAP) {
> - diag_set(ClientError, ER_UNSUPPORTED, "update",
> - "path intersection on map");
> - return NULL;
> + if (new_token.type != JSON_TOKEN_STR) {
> + xrow_update_err(new_op, "can not update map by "\
> + "non-string key");
> + return NULL;
> + }
> + new_op->token_type = JSON_TOKEN_STR;
> + new_op->key = new_token.str;
> + new_op->key_len = new_token.len;
> + if (xrow_update_route_branch_map(next_hop, parent, &child,
> + old_token.str,
> + old_token.len) != 0)
> + return NULL;
> } else {
> xrow_update_err_no_such_field(new_op);
> return NULL;
> diff --git a/test/box/update.result b/test/box/update.result
> index bf76f6638..4f9349bce 100644
> --- a/test/box/update.result
> +++ b/test/box/update.result
> @@ -1471,6 +1471,158 @@ t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
> ---
> - error: Field ''[4][3][2]'' was not found in the tuple
> ...
> +--
> +-- Intersecting map updates.
> +--
> +t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
> +---
> +- [1, {}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14], 15, 16, [17, 18,
> + 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 2300, 'e': 2400}}, 25]]
> +...
> +-- Fill an empty map.
> +t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
> +---
> +- [1, {'k': 100, 'v': 200}, [], [1, 2, 3, [4, 5, 6, 7, [8, 9, [10, 11, 12], 13, 14],
> + 15, 16, [17, 18, 19]], 20, {'b': 22, 'a': 21, 'c': {'d': 23, 'e': 24}}, 25]]
> +...
> +t = {} \
> +t[1] = 1 \
> +t[2] = { \
> + a = 1, \
> + b = 2, \
> + c = {3, 4, 5}, \
> + d = { \
> + [0] = 0, \
> + e = 6, \
> + [true] = false, \
> + f = 7, \
> + [-1] = -1, \
> + }, \
> + g = { \
> + {k = 8, v = 9}, \
> + [box.NULL] = box.NULL, \
> + {k = 10, v = 11}, \
> + {k = '12str', v = '13str'}, \
> + }, \
> + h = { \
> + [-1] = {{{{-1}, -1}, -1}, -1}, \
> + i = { \
> + j = 14, \
> + k = 15, \
> + }, \
> + m = 16, \
> + } \
> +}
> +---
> +...
> +t[3] = {}
> +---
> +...
> +t = box.tuple.new(t)
> +---
> +...
> +-- Insert and delete from the same map at once.
> +t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
> +---
> +- [1, {'a': 1, 'n': 17, 'd': {0: 0, -1: -1, true: false, 'f': 7, 'e': 6}, 'h': {'i': {
> + 'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3, 4, 5], 'g': {
> + 1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
> + null: null}}, []]
> +...
> +-- Delete a not existing key.
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
> +---
> +- [1, {'b': 2, 'a': 1, 'd': {0: 0, 'f': 7, 'e': 6, -1: -1, true: false, 'g': 6000},
> + 'h': {'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'c': [3,
> + 4, 5], 'g': {1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
> + null: null}}, []]
> +...
> +-- Scalar operations.
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
> +---
> +- [1, {'b': 3, 'c': [3, 4, 5], 'd': {0: 0, true: false, 'f': 8, -1: -1, 'e': 7}, 'h': {
> + 'i': {'j': 14, 'k': 15}, -1: [[[[-1], -1], -1], -1], 'm': 16}, 'a': 1, 'g': {
> + 1: {'k': 8, 'v': 9}, 2: {'k': 10, 'v': 11}, 3: {'k': '12str', 'v': '13str'},
> + null: null}}, []]
> +...
> +-- Errors.
> +-- This operation array creates an update tree with several full
> +-- featured maps, not bars. It is required to cover some complex
> +-- cases about fields extraction.
> +ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
> +---
> +...
> +-- When a map update extracts a next token on demand, it should
> +-- reject bad json, obviously. Scalar and '!'/'#' operations are
> +-- implemented separately and tested both.
> +ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
> +...
> +ops[#ops][1] = '!'
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
> +...
> +ops[#ops][1] = '#'
> +---
> +...
> +ops[#ops][3] = 1
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i.<badjson>'' UPDATE error: invalid JSON in position 9'
> +...
> +-- Key extractor should reject any attempt to use non-string keys.
> +ops[#ops] = {'-', '[2].h.i[20]', -1}
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
> + key'
> +...
> +ops[#ops][1] = '='
> +---
> +...
> +t:update(ops)
> +---
> +- error: 'Field ''[2].h.i[20]'' UPDATE error: can''t update a map by not a string
> + key'
> +...
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
> +---
> +- error: Field ''[2].d.k'' was not found in the tuple
> +...
> +t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
> +---
> +- error: Field ''[2].d.g.h'' was not found in the tuple
> +...
> +t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
> +---
> +- error: 'Field ''[2].d.e'' UPDATE error: the key exists already'
> +...
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
> +---
> +- error: 'Field ''[2].d.old'' UPDATE error: can delete only 1 field from a map in
> + a row'
> +...
> +-- '!'/'=' can be used to create a field, but only if it is a
> +-- tail. These operations can't create the whole path.
> +t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
> +---
> +- error: Field ''[2].d.new.new'' was not found in the tuple
> +...
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
> +---
> +- error: Field ''[2].d.old.old'' was not found in the tuple
> +...
> s:drop()
> ---
> ...
> diff --git a/test/box/update.test.lua b/test/box/update.test.lua
> index d91a2e95c..6fed818f4 100644
> --- a/test/box/update.test.lua
> +++ b/test/box/update.test.lua
> @@ -515,4 +515,79 @@ t:update({{'!', '[4][3]', 2.5}, {'+', '[4][5][5][5]', 0.75}, {'+', '[4][5][3]',
> -- during branching.
> t:update({{'+', '[4][3]', 0.5}, {'+', '[4][3][2]', 0.5}})
>
> +--
> +-- Intersecting map updates.
> +--
> +t:update({{'=', '[4][6].c.d', 2300}, {'=', '[4][6].c.e', 2400}})
> +-- Fill an empty map.
> +t:update({{'=', '[2].k', 100}, {'=', '[2].v', 200}})
> +
> +t = {} \
> +t[1] = 1 \
> +t[2] = { \
> + a = 1, \
> + b = 2, \
> + c = {3, 4, 5}, \
> + d = { \
> + [0] = 0, \
> + e = 6, \
> + [true] = false, \
> + f = 7, \
> + [-1] = -1, \
> + }, \
> + g = { \
> + {k = 8, v = 9}, \
> + [box.NULL] = box.NULL, \
> + {k = 10, v = 11}, \
> + {k = '12str', v = '13str'}, \
> + }, \
> + h = { \
> + [-1] = {{{{-1}, -1}, -1}, -1}, \
> + i = { \
> + j = 14, \
> + k = 15, \
> + }, \
> + m = 16, \
> + } \
> +}
> +t[3] = {}
> +t = box.tuple.new(t)
> +-- Insert and delete from the same map at once.
> +t:update({{'!', '[2].n', 17}, {'#', '[2].b', 1}})
> +-- Delete a not existing key.
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 1}})
> +
> +-- Scalar operations.
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].b', 1}, {'+', '[2].d.f', 1}})
> +
> +-- Errors.
> +-- This operation array creates an update tree with several full
> +-- featured maps, not bars. It is required to cover some complex
> +-- cases about fields extraction.
> +ops = {{'=', '[2].h.i.j', 14000}, {'=', '[2].h.i.k', 15000}, {'=', '[2].h.m', 16000}, {'=', '[2].b', 2000}, {}}
> +-- When a map update extracts a next token on demand, it should
> +-- reject bad json, obviously. Scalar and '!'/'#' operations are
> +-- implemented separately and tested both.
> +ops[#ops] = {'+', '[2].h.i.<badjson>', -1}
> +t:update(ops)
> +ops[#ops][1] = '!'
> +t:update(ops)
> +ops[#ops][1] = '#'
> +ops[#ops][3] = 1
> +t:update(ops)
> +-- Key extractor should reject any attempt to use non-string keys.
> +ops[#ops] = {'-', '[2].h.i[20]', -1}
> +t:update(ops)
> +ops[#ops][1] = '='
> +t:update(ops)
> +
> +t:update({{'+', '[2].d.e', 1}, {'+', '[2].d.f', 1}, {'+', '[2].d.k', 1}})
> +t:update({{'=', '[2].d.e', 6000}, {'!', '[2].d.g.h', -1}})
> +t:update({{'!', '[2].d.g', 6000}, {'!', '[2].d.e', -1}})
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old', 10}})
> +-- '!'/'=' can be used to create a field, but only if it is a
> +-- tail. These operations can't create the whole path.
> +t:update({{'=', '[2].d.g', 6000}, {'=', '[2].d.new.new', -1}})
> +t:update({{'=', '[2].d.g', 6000}, {'#', '[2].d.old.old', 1}})
> +
> s:drop()
> --
> 2.21.0 (Apple Git-122.2)
>
More information about the Tarantool-patches
mailing list