Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH 2/2] tuple: account the whole array in field.data and size
Date: Tue, 12 Nov 2019 00:10:48 +0100
Message-ID: <1c71867c22c60c3f139d8ee9d29079ed3e874aeb.1573513733.git.v.shpilevoy@tarantool.org>
In-Reply-To: <cover.1573513733.git.v.shpilevoy@tarantool.org>

Before the patch, a struct xrow_update_field object didn't account
for the array header in its .size and .data members. It was not
needed, because updates could only be 'flat'.
For example, consider the tuple:

    [mp_array, mp_uint, mp_uint, mp_uint]
              ^                         ^
             pos1                      pos2

The .size and .data members of struct xrow_update_field covered
the memory from pos1 to pos2, without the array header. The number
of fields was stored inside a rope object. This is why it made no
sense to keep a pointer to the array header.
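
The difference between the old and the new borders can be sketched
in plain C. This is only an illustration: struct field_borders and
the three helper functions are hypothetical names and are not part
of the patch. The header sizes follow the MessagePack array formats
(fixarray, array 16, array 32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the .data and .size members. */
struct field_borders {
	const char *data;
	size_t size;
};

/* Size in bytes of a MessagePack array header, by its first byte. */
static size_t
array_header_size(const char *header)
{
	uint8_t c = (uint8_t)header[0];
	if (c >= 0x90 && c <= 0x9f)
		return 1; /* fixarray: the count is in the same byte. */
	if (c == 0xdc)
		return 3; /* array 16: 1 byte + 16-bit count. */
	assert(c == 0xdd);
	return 5; /* array 32: 1 byte + 32-bit count. */
}

/* Old accounting: from pos1 to pos2, the header is skipped. */
static struct field_borders
borders_before_patch(const char *header, const char *end)
{
	const char *pos1 = header + array_header_size(header);
	struct field_borders b = {pos1, (size_t)(end - pos1)};
	return b;
}

/* New accounting: the whole array, including its header. */
static struct field_borders
borders_after_patch(const char *header, const char *end)
{
	struct field_borders b = {header, (size_t)(end - header)};
	return b;
}
```

For a 4-byte tuple of three small unsigned integers the old borders
cover the last 3 bytes, and the new borders cover all 4.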

But now updates are going to be non-flat, and not only for arrays.
There will be an update tree. Each node of that tree will describe
an update of some part of a tuple.

Some of the nodes will need to know the exact borders of their
children, including headers. The borders are going to be used for
fast copying of the neighbours of such children. Consider an
example.

Tuple with one field consisting of nested maps:

    tuple = {}
    tuple[1] = {
        a = {
            b = {
                c = {
                    d = {1, 2, 3}
                }
            }
        }
    }

Update:

    {{'+', '[1].a.b.c.d[1]', 1}, {'+', '[1].a.b.c.d[2]', 1}}

To update such a tuple a simple tree will be built:

            root: [ [1] ]
                     |
 isolated path: [ 'a.b.c' ]
                     |
      leaves: [ [1] [2] [3] ]
                +1  +1   -

The root node keeps the borders of the whole tuple. It is a rope
with a single field.
This single field is a deeply updated map. Such deep multiple
updates with long common prefixes are stored as an isolated path
+ map/array in the end. Here the isolated path is 'a.b.c'. It
ends with the terminal array update.

Assume that the operations are applied and it is time to save the
result. The save starts from the root.
The root rope will encode the root array header and will try to
save the single field. The single field is an isolated update. It
needs to save everything before the old {1,2,3}, the new array
{2,3,3}, and everything after the old array. The simplest way to
do it is to know the exact borders of the old array {1,2,3} and
memcpy all the memory before and after.

This is exactly what this patch allows. Everything before
xrow_update_field.data and after xrow_update_field.data + .size
can be safely copied, and is not related to the field. To copy
the adjacent memory it is not even necessary to know the field
type. Xrow_update_field.data and .size have the same meaning for
all field types.
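
A minimal sketch of such a save, assuming the borders of the old
field are already known. The function splice_field() and its
parameters are hypothetical and only illustrate the memcpy of the
neighbours; this is not the code that the follow-up patches add.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy buf into out, replacing the bytes
 * [field_data, field_data + field_size) with new_field.
 * The field type is never inspected, only its borders are used.
 * Returns the number of bytes written to out.
 */
static size_t
splice_field(const char *buf, size_t buf_size,
	     const char *field_data, size_t field_size,
	     const char *new_field, size_t new_size, char *out)
{
	assert(field_data >= buf);
	assert(field_data + field_size <= buf + buf_size);
	size_t prefix = (size_t)(field_data - buf);
	size_t suffix = buf_size - prefix - field_size;
	/* Everything before the old field. */
	memcpy(out, buf, prefix);
	/* The newly encoded field, header included. */
	memcpy(out + prefix, new_field, new_size);
	/* Everything after the old field. */
	memcpy(out + prefix + new_size, field_data + field_size, suffix);
	return prefix + new_size + suffix;
}
```

The caller has to provide an output buffer of at least
buf_size - field_size + new_size bytes. Because the old borders
include the array header, the new field may have a header of a
different size and the copy is still correct.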

Part of #1261
---
 src/box/xrow_update.c       | 28 ++++++++++++++++------------
 src/box/xrow_update_array.c |  9 +++++----
 src/box/xrow_update_field.h |  6 ++++--
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/src/box/xrow_update.c b/src/box/xrow_update.c
index bb98b30ad..123db081a 100644
--- a/src/box/xrow_update.c
+++ b/src/box/xrow_update.c
@@ -269,11 +269,12 @@ xrow_update_read_ops(struct xrow_update *update, const char *expr,
  * @retval -1 Error.
  */
 static int
-xrow_update_do_ops(struct xrow_update *update, const char *old_data,
-		   const char *old_data_end, uint32_t part_count)
+xrow_update_do_ops(struct xrow_update *update, const char *header,
+		   const char *old_data, const char *old_data_end,
+		   uint32_t part_count)
 {
-	if (xrow_update_array_create(&update->root, old_data, old_data_end,
-				     part_count) != 0)
+	if (xrow_update_array_create(&update->root, header, old_data,
+				     old_data_end, part_count) != 0)
 		return -1;
 	struct xrow_update_op *op = update->ops;
 	struct xrow_update_op *ops_end = op + update->op_count;
@@ -290,12 +291,12 @@ xrow_update_do_ops(struct xrow_update *update, const char *old_data,
  *        and it is enough to simply write the error to the log.
  */
 static int
-xrow_upsert_do_ops(struct xrow_update *update, const char *old_data,
-		   const char *old_data_end, uint32_t part_count,
-		   bool suppress_error)
+xrow_upsert_do_ops(struct xrow_update *update, const char *header,
+		   const char *old_data, const char *old_data_end,
+		   uint32_t part_count, bool suppress_error)
 {
-	if (xrow_update_array_create(&update->root, old_data, old_data_end,
-				     part_count) != 0)
+	if (xrow_update_array_create(&update->root, header, old_data,
+				     old_data_end, part_count) != 0)
 		return -1;
 	struct xrow_update_op *op = update->ops;
 	struct xrow_update_op *ops_end = op + update->op_count;
@@ -352,12 +353,14 @@ xrow_update_execute(const char *expr,const char *expr_end,
 {
 	struct xrow_update update;
 	xrow_update_init(&update, index_base);
+	const char *header = old_data;
 	uint32_t field_count = mp_decode_array(&old_data);
 
 	if (xrow_update_read_ops(&update, expr, expr_end, dict,
 				 field_count) != 0)
 		return NULL;
-	if (xrow_update_do_ops(&update, old_data, old_data_end, field_count))
+	if (xrow_update_do_ops(&update, header, old_data, old_data_end,
+			       field_count) != 0)
 		return NULL;
 	if (column_mask)
 		*column_mask = update.column_mask;
@@ -373,13 +376,14 @@ xrow_upsert_execute(const char *expr,const char *expr_end,
 {
 	struct xrow_update update;
 	xrow_update_init(&update, index_base);
+	const char *header = old_data;
 	uint32_t field_count = mp_decode_array(&old_data);
 
 	if (xrow_update_read_ops(&update, expr, expr_end, dict,
 				 field_count) != 0)
 		return NULL;
-	if (xrow_upsert_do_ops(&update, old_data, old_data_end, field_count,
-			       suppress_error))
+	if (xrow_upsert_do_ops(&update, header, old_data, old_data_end,
+			       field_count, suppress_error) != 0)
 		return NULL;
 	if (column_mask)
 		*column_mask = update.column_mask;
diff --git a/src/box/xrow_update_array.c b/src/box/xrow_update_array.c
index b5f443cd0..7f198076b 100644
--- a/src/box/xrow_update_array.c
+++ b/src/box/xrow_update_array.c
@@ -142,12 +142,13 @@ xrow_update_array_extract_item(struct xrow_update_field *field,
 }
 
 int
-xrow_update_array_create(struct xrow_update_field *field, const char *data,
-			 const char *data_end, uint32_t field_count)
+xrow_update_array_create(struct xrow_update_field *field, const char *header,
+			 const char *data, const char *data_end,
+			 uint32_t field_count)
 {
 	field->type = XUPDATE_ARRAY;
-	field->data = data;
-	field->size = data_end - data;
+	field->data = header;
+	field->size = data_end - header;
 	struct region *region = &fiber()->gc;
 	field->array.rope = xrow_update_rope_new(region);
 	if (field->array.rope == NULL)
diff --git a/src/box/xrow_update_field.h b/src/box/xrow_update_field.h
index 04e452d23..e90095b9e 100644
--- a/src/box/xrow_update_field.h
+++ b/src/box/xrow_update_field.h
@@ -334,6 +334,7 @@ xrow_update_##type##_store(struct xrow_update_field *field, char *out,		\
 /**
  * Initialize @a field as an array to update.
  * @param[out] field Field to initialize.
+ * @param header Header of the MessagePack array @a data.
  * @param data MessagePack data of the array to update.
  * @param data_end End of @a data.
  * @param field_count Field count in @data.
@@ -342,8 +343,9 @@ xrow_update_##type##_store(struct xrow_update_field *field, char *out,		\
  * @retval -1 Error.
  */
 int
-xrow_update_array_create(struct xrow_update_field *field, const char *data,
-			 const char *data_end, uint32_t field_count);
+xrow_update_array_create(struct xrow_update_field *field, const char *header,
+			 const char *data, const char *data_end,
+			 uint32_t field_count);
 
 OP_DECL_GENERIC(array)
 
-- 
2.21.0 (Apple Git-122.2)
