[tarantool-patches] Re: [PATCH v2 2/8] tuple: rework updates to improve code extendibility

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sat Sep 28 01:33:12 MSK 2019


Hi! Thanks for the review!

On 03/09/2019 21:20, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/09/01 00:33]:
> 
> If you create a directory, please make it a standalone library, so
> that dependencies are controlled, both at compile-time and at
> link-time.

I tried - it is not possible. Update depends on tuple_dictionary and
TUPLE_INDEX_BASE from tuple_format.h. Updates need to be a part of
the tuple library.

> I don't see a reason to keep the top-level update api in src/box, 
> please move it to update/update.h or alike.
I don't see a reason not to keep it, but I don't mind to move
all to update/.

=====================================================================

diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 62737aafe..df5c686dd 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -40,7 +40,7 @@ add_library(tuple STATIC
     tuple.c
     field_map.c
     tuple_format.c
-    tuple_update.c
+    update/update.c
     update/update_field.c
     update/update_array.c
     tuple_compare.cc
diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 3902288bf..7ec58b9d3 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -29,7 +29,7 @@
  * SUCH DAMAGE.
  */
 #include "box/lua/tuple.h"
-#include "box/tuple_update.h"
+#include "box/update/update.h"
 
 #include "lua/utils.h" /* luaT_error() */
 #include "lua/msgpack.h" /* luamp_encode_XXX() */
diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
index 487cfdadd..04640bab9 100644
--- a/src/box/memtx_space.c
+++ b/src/box/memtx_space.c
@@ -33,7 +33,7 @@
 #include "iproto_constants.h"
 #include "txn.h"
 #include "tuple.h"
-#include "tuple_update.h"
+#include "update/update.h"
 #include "xrow.h"
 #include "memtx_hash.h"
 #include "memtx_tree.h"
diff --git a/src/box/space.c b/src/box/space.c
index 042be042c..2c7a71de6 100644
--- a/src/box/space.c
+++ b/src/box/space.c
@@ -39,7 +39,7 @@
 #include "session.h"
 #include "txn.h"
 #include "tuple.h"
-#include "tuple_update.h"
+#include "update/update.h"
 #include "request.h"
 #include "xrow.h"
 #include "iproto_constants.h"
diff --git a/src/box/tuple.c b/src/box/tuple.c
index 261505f9a..fb398f08e 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -36,7 +36,7 @@
 #include "small/quota.h"
 #include "small/small.h"
 
-#include "tuple_update.h"
+#include "update/update.h"
 #include "coll_id_cache.h"
 
 static struct mempool tuple_iterator_pool;
diff --git a/src/box/tuple.h b/src/box/tuple.h
index ab0c7a99b..6d2bf5261 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -1115,7 +1115,7 @@ tuple_to_buf(struct tuple *tuple, char *buf, size_t size);
 #if defined(__cplusplus)
 } /* extern "C" */
 
-#include "tuple_update.h"
+#include "update/update.h"
 #include "errinj.h"
 
 /* @copydoc tuple_field_with_type() */
diff --git a/src/box/tuple_update.c b/src/box/update/update.c
similarity index 99%
rename from src/box/tuple_update.c
rename to src/box/update/update.c
index 36dc7a50d..a57a2eb60 100644
--- a/src/box/tuple_update.c
+++ b/src/box/update/update.c
@@ -28,13 +28,13 @@
  * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
-#include "tuple_update.h"
-#include "error.h"
+#include "update.h"
+#include "box/error.h"
 #include "diag.h"
 #include <msgpuck/msgpuck.h>
-#include "column_mask.h"
+#include "box/column_mask.h"
 #include "fiber.h"
-#include "update/update_field.h"
+#include "update_field.h"
 
 /**
  * UPDATE is represented by a sequence of operations, each
diff --git a/src/box/tuple_update.h b/src/box/update/update.h
similarity index 95%
rename from src/box/tuple_update.h
rename to src/box/update/update.h
index b6210dd38..f1798341c 100644
--- a/src/box/tuple_update.h
+++ b/src/box/update/update.h
@@ -1,5 +1,5 @@
-#ifndef TARANTOOL_BOX_TUPLE_UPDATE_H_INCLUDED
-#define TARANTOOL_BOX_TUPLE_UPDATE_H_INCLUDED
+#ifndef TARANTOOL_BOX_UPDATE_UPDATE_H_INCLUDED
+#define TARANTOOL_BOX_UPDATE_UPDATE_H_INCLUDED
 /*
  * Copyright 2010-2016, Tarantool AUTHORS, please see AUTHORS file.
  *
@@ -83,4 +83,4 @@ tuple_upsert_squash(const char *expr1, const char *expr1_end,
 } /* extern "C" */
 #endif /* defined(__cplusplus) */
 
-#endif /* TARANTOOL_BOX_TUPLE_UPDATE_H_INCLUDED */
+#endif /* TARANTOOL_BOX_UPDATE_UPDATE_H_INCLUDED */
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 23910795f..41cb29e07 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -58,7 +58,7 @@
 #include "coio_task.h"
 #include "cbus.h"
 #include "histogram.h"
-#include "tuple_update.h"
+#include "update/update.h"
 #include "txn.h"
 #include "xrow.h"
 #include "xlog.h"
diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index 817d29cfd..0abaeea54 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -34,7 +34,7 @@
 #include <small/region.h>
 #include <msgpuck/msgpuck.h>
 #include "vy_stmt.h"
-#include "tuple_update.h"
+#include "update/update.h"
 #include "fiber.h"
 #include "column_mask.h"
 
diff --git a/test/unit/column_mask.c b/test/unit/column_mask.c
index 5ee8b7332..23c323bd4 100644
--- a/test/unit/column_mask.c
+++ b/test/unit/column_mask.c
@@ -1,5 +1,5 @@
 #include "column_mask.h"
-#include "tuple_update.h"
+#include "update/update.h"
 #include "unit.h"
 #include "msgpuck.h"
 #include "trivia/util.h"

=====================================================================

> 
> Generally I don't (I guess yet?) get why you make the split, I
> hope to learn in a subsequent patches.

Because update implementations for each field type are very different
and independent from each other. And I decided that it would be simpler
to maintain several relatively small and simple files for each field
type, than one huge mega-file for all updates. Actually, I borrowed
that file organization from some other databases, where I've looked
for an inspiration. Such as mongo.

> 
> Judging by the change in signatures of error messages (%u -> %s), 
> you changed the error messages a bit. 
> 
> I would not be able to review this change, since you move the code
> around. I trust you however that it is trivial.

Yes, you were right. There were some changes which I did just being
engrossed with the huge code movements. I extracted them into separate
commits and we already pushed them to the master.

> 
> Sigh, moving code screws up the commit history, are you sure its
> really necessary...

Yes. I tried to keep one file, and it appeared to be unmaintainable.
It is not like iproto.cc, where almost the entire file is one piece of
interlinked code. Here most of the code parts are too big and too
independent to be kept in one file.

>> -		if (op->meta->do_op(update, op))
>> +		if (op->meta->do_cb(op, &update->root_array) != 0)
> 
> These are noise changes. I am generally fine with them, even
> though I don't like this particular one, but they
> are trivial and should be committed separately.
> 
> In particular, do_cb si a meaningless name. When you invoke a
> function, you do it. If you invoke a callback, you do a callback.
> This is like function "function".
> Besides, I don't think metatable functions are callbacks. They are a C way
> of handling polymorphism. A callback is something that you pass
> into another function as an inverse dependency, it is usually
> devoid of state.
> 

Ok, agree. I returned do_op, read_arg, store methods.

>> +	if (update_array_create(&update->root_array, old_data, old_data_end,
>> +				part_count) != 0)
> 
> root_array is also a bad name IMHO. first of all, there is no
> point in calling an array variable "array", so, perhaps,
> update->root would be just as good. But this suggests that this is
> a root node in some tree-like data structure. But then why is this
> data structure called update_array? We had the very same problem
> in json parser, and there we called the structure json_tree_node.
> 

Ok, I don't see much difference here, so I just changed the name
to 'root'.

It is a tree-like structure, but I don't think such a tree has an
official name. Nodes of that tree can be arrays, maps. Leaves can be
anything: array/map/scalar/bar. This is why the root node was called
'root_array'.

=====================================================================

diff --git a/src/box/update/update.c b/src/box/update/update.c
index a57a2eb60..564f1b75c 100644
--- a/src/box/update/update.c
+++ b/src/box/update/update.c
@@ -115,7 +115,7 @@ struct tuple_update {
 	/** A bitmask of all columns modified by this update. */
 	uint64_t column_mask;
 	/** First level of update tree. It is always array. */
-	struct update_field root_array;
+	struct update_field root;
 };
 
 /**
@@ -267,13 +267,13 @@ static int
 update_do_ops(struct tuple_update *update, const char *old_data,
 	      const char *old_data_end, uint32_t part_count)
 {
-	if (update_array_create(&update->root_array, old_data, old_data_end,
+	if (update_array_create(&update->root, old_data, old_data_end,
 				part_count) != 0)
 		return -1;
 	struct update_op *op = update->ops;
 	struct update_op *ops_end = op + update->op_count;
 	for (; op < ops_end; op++) {
-		if (op->meta->do_op(op, &update->root_array) != 0)
+		if (op->meta->do_op(op, &update->root) != 0)
 			return -1;
 	}
 	return 0;
@@ -289,13 +289,13 @@ upsert_do_ops(struct tuple_update *update, const char *old_data,
 	      const char *old_data_end, uint32_t part_count,
 	      bool suppress_error)
 {
-	if (update_array_create(&update->root_array, old_data, old_data_end,
+	if (update_array_create(&update->root, old_data, old_data_end,
 				part_count) != 0)
 		return -1;
 	struct update_op *op = update->ops;
 	struct update_op *ops_end = op + update->op_count;
 	for (; op < ops_end; op++) {
-		if (op->meta->do_op(op, &update->root_array) == 0)
+		if (op->meta->do_op(op, &update->root) == 0)
 			continue;
 		struct error *e = diag_last_error(diag_get());
 		if (e->type != &type_ClientError)
@@ -318,13 +318,13 @@ update_init(struct tuple_update *update, int index_base)
 static const char *
 update_finish(struct tuple_update *update, uint32_t *p_tuple_len)
 {
-	uint32_t tuple_len = update_array_sizeof(&update->root_array);
+	uint32_t tuple_len = update_array_sizeof(&update->root);
 	char *buffer = (char *) region_alloc(&fiber()->gc, tuple_len);
 	if (buffer == NULL) {
 		diag_set(OutOfMemory, tuple_len, "region_alloc", "buffer");
 		return NULL;
 	}
-	*p_tuple_len = update_array_store(&update->root_array, buffer,
+	*p_tuple_len = update_array_store(&update->root, buffer,
 					  buffer + tuple_len);
 	assert(*p_tuple_len == tuple_len);
 	return buffer;

=====================================================================

>>  		return NULL;
>> -	*p_tuple_len = update_write_tuple(update, buffer, buffer + tuple_len);
>> -	return buffer;
>> +	}
>> +	*p_tuple_len = update_array_store(&update->root_array, out,
>> +					  out + tuple_size);
> 
> Seems like you renaming things at random. Why do you think
> update_write_tuple() is worse than update_array_store?

Firstly why I changed 'tuple' to 'array'.
Because tuple is array, and besides array there are now
more types. 'Tuple' as a field type looks odd, comparing
to map, scalar, etc.

Secondly, why I changed 'write' to 'store'. Because we
already have 'store' method of struct update_op, and I
decided to use one name.

Thirdly, why I used 'array_store' instead of 'store_array'.
Because I wanted these methods:

    update_array_store
    update_map_store
    update_bar_store
    update_route_store
    ..._store

be methods of their field types: update of array, update of
map, update of bar, etc. And I decided to use the naming
policy:

    <update_type>_<action>.

> 
> I'm fine with a justified rename, but I don't get the idea behind
> the new name , and I don't think it makes the naming easier to
> understand.

It is debatable. The reasoning above looks fair to me. But we can
discuss.




More information about the Tarantool-patches mailing list