Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
@ 2020-10-03 13:28 Nikita Pettik
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 1/2] " Nikita Pettik
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-03 13:28 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

Issues:
https://github.com/tarantool/tarantool/issues/1622
https://github.com/tarantool/tarantool/issues/5105
https://github.com/tarantool/tarantool/issues/5092
https://github.com/tarantool/tarantool/issues/5107
Branch:
https://github.com/tarantool/tarantool/tree/np/gh-5107-dont-squash-ops

@ChangeLog:
 - Rework upsert operation in vinyl so that now (gh-5107):
   - if upsert can't be applied it is skipped and corresponding error is logged (gh-1622);
   - upserts now follow associative property: result of several upserts
     doesn't depend on the order of their application (gh-5105);
   - upserts referring to -1 fieldno are handled correctly now (gh-5087).
   - there's no more upserts squash procedure: upserts referring to the
     same field with arithmetic operations are not merged into one 
     operation since resulting upsert might not be applied - as a result
     both upserts would be ignored (meanwhile only one should be).

Changes in v3:
 - Removed upsert squashing procedure;
 - Added two additional test covering overflow error during upsert
   application;
 - Fixed NULL dereference bug in vy_apply_upsert_on_terminal_stmt().

Nikita Pettik (2):
  vinyl: rework upsert operation
  vinyl: remove squash procedures from source code

 src/box/vinyl.c                 |   2 +-
 src/box/vy_stmt.c               |  28 +-
 src/box/vy_stmt.h               |   5 +-
 src/box/vy_upsert.c             | 302 +++++++++++---------
 src/box/xrow_update.c           | 144 ----------
 src/box/xrow_update.h           |  14 -
 test/unit/vy_iterators_helper.c |   2 +-
 test/vinyl/upsert.result        | 473 ++++++++++++++++++++++++++++++++
 test/vinyl/upsert.test.lua      | 194 +++++++++++++
 9 files changed, 867 insertions(+), 297 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-03 13:28 [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
@ 2020-10-03 13:28 ` Nikita Pettik
  2020-10-06 22:12   ` Vladislav Shpilevoy
  2020-10-13 19:00   ` Aleksandr Lyapunov
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 2/2] vinyl: remove squash procedures from source code Nikita Pettik
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-03 13:28 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

Previous upsert implementation had a few drawbacks which led to number
of various bugs and issues.

Issue #5092 (redundant update operations execution)

In a nutshell, application of upsert(s) (on top of another upsert)
consists of two actions (see vy_apply_upsert()): execute and squash.
Consider example:

insert({1, 1})  -- terminal statement, stored on disk
upsert({1}, {{'-', 2, 20}}) -- old ups1
upsert({1}, {{'+', 2, 10}}) -- new ups2

'Execute' step takes update operations from the new upsert and combines them
with key of the old upsert.  {1} + {'+', 2, 10} can't be evaluated since
key consists of only one field. Note that in case upsert doesn't fold
into insert the upsert's tuple and the tuple stored in index can be
different In our particular case, tuple stored on disk has two fields
({1, 1}), so first upsert's update operation can be applied to it:
{1, 1} + {'+', 2, 10} --> {1, 11}. If upsert's operation can't be executed
using key of old upsert, we simply continue processing squash step.
In turn 'squash' is a combination of update operations: arithmetic
operations are combined so we don't have to store actions over the same
field; the rest operations - are merged into single array. As a result,
we get one upsert with squashed operations: upsert({1}, {{'+', 2, -10}}).
Then vy_apply_upsert() is called again to apply new upsert on the top of
terminal statement - insert{1, 1}. Since now tuple has second field,
update operations can be executed and corresponding result is {1, -9}.
It is the final result of upsert application procedure.
Now imagine that we have following upserts:

upsert({1, 1}, {{'-', 2, 20}}) -- old ups1
upsert({1}, {{'+', 2, 10}}) -- new ups2

In this case execution successfully finishes and modifies old upsert's
tuple: {1, 1} + {'+', 2, 10} --> {1, 11}
However, we still have to squash/accumulate update operations since they
may be applied on tuple stored on disk later. After all, we have
following upsert: upsert({2, 11}, {{'+', 2, -10}}). Then it is applied
on the top of insert({1, 1}) and we get the same result as in the first
case - {1, -9}. The only difference is that upsert's tuple was modified.
As one can see, execution of update operations applied to upsert's tuple
is redundant in the case index already contains tuple with the same key
(i.e. when upserts turns into update). Instead, we are able to
accumulate/squash update operations only. When the last upsert is being
applied, we can either execute all update operation on tuple fetched
from index (i.e. upsert is update) OR on tuple specified in the first
upsert (i.e. first upsert is insert).

Issue #5105 (upsert doesn't follow associative property)

Secondly, current approach breaks associative property: after upsert's
update operations are merged into one array, part of them (related to
one upsert) can be skipped, meanwhile the rest - is applied. For
instance:

-- Index is over second field.
i = s:create_index('pk', {parts={2, 'uint'}})
s:replace{1, 2, 3, 'default'}
s:upsert({2, 2, 2}, {{'=', 4, 'upserted'}})
-- First update operation modifies primary key, so upsert must be ignored.
s:upsert({2, 2, 2}, {{'#', 1, 1}, {'!', 3, 1}})

After merging two upserts we get the next one:
upsert({2, 2, 2}, {{'=', 4, 'upserted'}, {'#', 1, 1}, {'!', 3, 1}}

While we executing update operations, we don't distinguish operations from
different upserts. Thus, if one operation fails, the rest are ignored
as well. As a result, first (in general case - all preceding squashed
upserts) upsert won't be applied, even despite the fact it is
absolutely correct. What is more, user gets no error/warning concerning
this fact.

Issue #1622 (no upsert result validation)

After upsert application, there's no check verifying that result
satisfies space's format: number of fields, their types, overflows etc.
Due to this tuples violating format may appear in the space, which in
turn may lead to unpredictable consequences.

To resolve these issues, let's group update operations of each upsert into
separate array. So that operations related to particular upsert are
stored in single array. In terms of previous example we will get:
upsert({2, 2, 2}, {{{'=', 4, 'upserted'}}, {{'#', 1, 1}, {'!', 3, 1}}}

Also note that we don't longer have to apply update operations on tuple
in vy_apply_upsert() when it comes for two upserts: it can be done once we
face terminal statement; or if there's no underlying statement (i.e. it is
delete statement or no statement at all) we apply all update arrays except
the first one on upsert's tuple. In case one of operations from array
fail, we skip the rest operations from this array and process to the
next array. After successful application of update operations of each
array, we check that the resulting tuple fits into space format. If they
aren't, we rollback applied operations, log error and moving to the next
group of operations.

Finally, arithmetic operations are not longer able to be combined: it is
requirement which is arises from #5105 issue.  Otherwise, result of
upserts combination might turn out to be inapplicable to the tuple
stored on disk (e.g. result applied on tuple leads to integer overflow -
in this case only last upsert leading to overflow must be ignored).

Closes #1622
Closes #5105
Closes #5092
Part of #5107
---
 src/box/vinyl.c                 |   2 +-
 src/box/vy_stmt.c               |  28 +-
 src/box/vy_stmt.h               |   5 +-
 src/box/vy_upsert.c             | 302 +++++++++++---------
 test/unit/vy_iterators_helper.c |   2 +-
 test/vinyl/upsert.result        | 473 ++++++++++++++++++++++++++++++++
 test/vinyl/upsert.test.lua      | 194 +++++++++++++
 7 files changed, 867 insertions(+), 139 deletions(-)

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index cee39c58c..9d9ee0613 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -1976,7 +1976,7 @@ vy_lsm_upsert(struct vy_tx *tx, struct vy_lsm *lsm,
 	operations[0].iov_base = (void *)expr;
 	operations[0].iov_len = expr_end - expr;
 	vystmt = vy_stmt_new_upsert(lsm->mem_format, tuple, tuple_end,
-				    operations, 1);
+				    operations, 1, false);
 	if (vystmt == NULL)
 		return -1;
 	assert(vy_stmt_type(vystmt) == IPROTO_UPSERT);
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 92e0aa1c5..2ee82b3f2 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -313,16 +313,22 @@ vy_key_dup(const char *key)
 static struct tuple *
 vy_stmt_new_with_ops(struct tuple_format *format, const char *tuple_begin,
 		     const char *tuple_end, struct iovec *ops,
-		     int op_count, enum iproto_type type)
+		     int op_count, enum iproto_type type, bool is_ops_encoded)
 {
 	mp_tuple_assert(tuple_begin, tuple_end);
 
 	const char *tmp = tuple_begin;
 	mp_decode_array(&tmp);
 
+	/*
+	 * ops are grouped in one extra array.
+	 * See vy_apply_upsert() for details.
+	 */
 	size_t ops_size = 0;
 	for (int i = 0; i < op_count; ++i)
 		ops_size += ops[i].iov_len;
+	if (!is_ops_encoded)
+		ops_size += mp_sizeof_array(op_count);
 
 	struct tuple *stmt = NULL;
 	struct region *region = &fiber()->gc;
@@ -360,6 +366,8 @@ vy_stmt_new_with_ops(struct tuple_format *format, const char *tuple_begin,
 	field_map_build(&builder, wpos - field_map_size);
 	memcpy(wpos, tuple_begin, mpsize);
 	wpos += mpsize;
+	if (!is_ops_encoded)
+		wpos = mp_encode_array(wpos, op_count);
 	for (struct iovec *op = ops, *end = ops + op_count;
 	     op != end; ++op) {
 		memcpy(wpos, op->iov_base, op->iov_len);
@@ -374,10 +382,11 @@ end:
 struct tuple *
 vy_stmt_new_upsert(struct tuple_format *format, const char *tuple_begin,
 		   const char *tuple_end, struct iovec *operations,
-		   uint32_t ops_cnt)
+		   uint32_t ops_cnt, bool is_ops_encoded)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    operations, ops_cnt, IPROTO_UPSERT);
+				    operations, ops_cnt, IPROTO_UPSERT,
+				    is_ops_encoded);
 }
 
 struct tuple *
@@ -385,7 +394,7 @@ vy_stmt_new_replace(struct tuple_format *format, const char *tuple_begin,
 		    const char *tuple_end)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    NULL, 0, IPROTO_REPLACE);
+				    NULL, 0, IPROTO_REPLACE, true);
 }
 
 struct tuple *
@@ -393,7 +402,7 @@ vy_stmt_new_insert(struct tuple_format *format, const char *tuple_begin,
 		   const char *tuple_end)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    NULL, 0, IPROTO_INSERT);
+				    NULL, 0, IPROTO_INSERT, true);
 }
 
 struct tuple *
@@ -401,7 +410,7 @@ vy_stmt_new_delete(struct tuple_format *format, const char *tuple_begin,
 		   const char *tuple_end)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    NULL, 0, IPROTO_DELETE);
+				    NULL, 0, IPROTO_DELETE, true);
 }
 
 struct tuple *
@@ -735,19 +744,20 @@ vy_stmt_decode(struct xrow_header *xrow, struct tuple_format *format)
 		/* Always use key format for DELETE statements. */
 		stmt = vy_stmt_new_with_ops(env->key_format,
 					    request.key, request.key_end,
-					    NULL, 0, IPROTO_DELETE);
+					    NULL, 0, IPROTO_DELETE, true);
 		break;
 	case IPROTO_INSERT:
 	case IPROTO_REPLACE:
 		stmt = vy_stmt_new_with_ops(format, request.tuple,
 					    request.tuple_end,
-					    NULL, 0, request.type);
+					    NULL, 0, request.type, true);
 		break;
 	case IPROTO_UPSERT:
 		ops.iov_base = (char *)request.ops;
 		ops.iov_len = request.ops_end - request.ops;
 		stmt = vy_stmt_new_upsert(format, request.tuple,
-					  request.tuple_end, &ops, 1);
+					  request.tuple_end, &ops,
+					  1, true);
 		break;
 	default:
 		/* TODO: report filename. */
diff --git a/src/box/vy_stmt.h b/src/box/vy_stmt.h
index 25219230d..65acbecac 100644
--- a/src/box/vy_stmt.h
+++ b/src/box/vy_stmt.h
@@ -528,6 +528,8 @@ vy_stmt_new_delete(struct tuple_format *format, const char *tuple_begin,
  * @param part_count Part count from key definition.
  * @param operations Vector of update operations.
  * @param ops_cnt Length of the update operations vector.
+ * @param is_ops_encoded True, if update operations are already packed
+  *                      into extra msgpack array.
  *
  * @retval NULL     Memory allocation error.
  * @retval not NULL Success.
@@ -535,7 +537,8 @@ vy_stmt_new_delete(struct tuple_format *format, const char *tuple_begin,
 struct tuple *
 vy_stmt_new_upsert(struct tuple_format *format,
 		   const char *tuple_begin, const char *tuple_end,
-		   struct iovec *operations, uint32_t ops_cnt);
+		   struct iovec *operations, uint32_t ops_cnt,
+		   bool is_ops_encoded);
 
 /**
  * Create REPLACE statement from UPSERT statement.
diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index e697b6321..2b5a16730 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -39,39 +39,150 @@
 #include "column_mask.h"
 
 /**
- * Try to squash two upsert series (msgspacked index_base + ops)
- * Try to create a tuple with squahed operations
+ * Check that key hasn't been changed after applying upsert operation.
+ */
+static bool
+vy_apply_result_does_cross_pk(struct tuple *old_stmt, const char *result,
+			      const char *result_end, struct key_def *cmp_def,
+			      uint64_t col_mask)
+{
+	if (!key_update_can_be_skipped(cmp_def->column_mask, col_mask)) {
+		struct tuple *tuple =
+			vy_stmt_new_replace(tuple_format(old_stmt), result,
+					    result_end);
+		int cmp_res = vy_stmt_compare(old_stmt, HINT_NONE, tuple,
+					      HINT_NONE, cmp_def);
+		tuple_unref(tuple);
+		return cmp_res != 0;
+	}
+	return false;
+}
+
+/**
+ * Apply update operations stored in @a upsert on tuple @a stmt. If @a stmt is
+ * void statement (i.e. it is NULL or delete statement) then operations are
+ * applied on tuple stored in @a upsert. Update operations of @a upsert which
+ * can't be applied are skipped along side with other operations from single
+ * group (i.e. packed in one msgpack array); errors may be logged depending on
+ * @a suppress_error flag.
  *
- * @retval 0 && *result_stmt != NULL : successful squash
- * @retval 0 && *result_stmt == NULL : unsquashable sources
- * @retval -1 - memory error
+ * @param upsert Upsert statement to be applied on @a stmt.
+ * @param stmt Statement to be used as base for upsert operations.
+ * @param cmp_def Key definition required to provide check of primary key
+ *                modification.
+ * @return Tuple containing result of upsert application; NULL in case OOM.
  */
-static int
-vy_upsert_try_to_squash(struct tuple_format *format,
-			const char *key_mp, const char *key_mp_end,
-			const char *old_serie, const char *old_serie_end,
-			const char *new_serie, const char *new_serie_end,
-			struct tuple **result_stmt)
+static struct tuple *
+vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
+				 struct key_def *cmp_def, bool suppress_error)
 {
-	*result_stmt = NULL;
+	assert(vy_stmt_type(upsert) == IPROTO_UPSERT);
+	assert(stmt == NULL || vy_stmt_type(stmt) != IPROTO_UPSERT);
 
-	size_t squashed_size;
-	const char *squashed =
-		xrow_upsert_squash(old_serie, old_serie_end,
-				   new_serie, new_serie_end, format,
-				   &squashed_size);
-	if (squashed == NULL)
-		return 0;
-	/* Successful squash! */
-	struct iovec operations[1];
-	operations[0].iov_base = (void *)squashed;
-	operations[0].iov_len = squashed_size;
+	uint32_t mp_size;
+	const char *new_ops = vy_stmt_upsert_ops(upsert, &mp_size);
+	/* Msgpack containing result of upserts application. */
+	const char *result_mp;
+	bool stmt_is_void = stmt == NULL || vy_stmt_type(stmt) == IPROTO_DELETE;
+	if (stmt_is_void)
+		result_mp = vy_upsert_data_range(upsert, &mp_size);
+	else
+		result_mp = tuple_data_range(stmt, &mp_size);
+	const char *result_mp_end = result_mp + mp_size;
+	/*
+	 * xrow_upsert_execute() allocates result using region,
+	 * so save starting point to release it later.
+	 */
+	struct region *region = &fiber()->gc;
+	size_t region_svp = region_used(region);
+	uint64_t column_mask = COLUMN_MASK_FULL;
+	struct tuple_format *format = tuple_format(upsert);
+
+	uint32_t ups_cnt = mp_decode_array(&new_ops);
+	const char *ups_ops = new_ops;
+	/*
+	 * In case upsert folds into insert, we must skip first
+	 * update operations.
+	 */
+	if (stmt_is_void) {
+		ups_cnt--;
+		mp_next(&ups_ops);
+	}
+	for (uint32_t i = 0; i < ups_cnt; ++i) {
+		assert(mp_typeof(*ups_ops) == MP_ARRAY);
+		const char *ups_ops_end = ups_ops;
+		mp_next(&ups_ops_end);
+		const char *exec_res = result_mp;
+		exec_res = xrow_upsert_execute(ups_ops, ups_ops_end, result_mp,
+					       result_mp_end, format, &mp_size,
+					       0, suppress_error, &column_mask);
+		if (exec_res == NULL) {
+			if (! suppress_error) {
+				struct error *e = diag_last_error(diag_get());
+				assert(e != NULL);
+				/* Bail out immediately in case of OOM. */
+				if (e->type != &type_ClientError) {
+					region_truncate(region, region_svp);
+					return NULL;
+				}
+				diag_log();
+			}
+			ups_ops = ups_ops_end;
+			continue;
+		}
+		/*
+		 * If it turns out that resulting tuple modifies primary
+		 * key, then simply ignore this upsert.
+		 */
+		if (stmt != NULL &&
+		    vy_apply_result_does_cross_pk(stmt, exec_res,
+						  exec_res + mp_size, cmp_def,
+						  column_mask)) {
+			if (!suppress_error) {
+				say_error("upsert operations %s are not applied"\
+					  " due to primary key modification",
+					  mp_str(ups_ops));
+			}
+			ups_ops = ups_ops_end;
+			continue;
+		}
+		ups_ops = ups_ops_end;
+		/*
+		 * Result statement must satisfy space's format. Since upsert's
+		 * tuple correctness is already checked in vy_upsert(), let's
+		 * use its format to provide result verification.
+		 */
+		struct tuple_format *format = tuple_format(upsert);
+		if (tuple_validate_raw(format, exec_res) != 0) {
+			if (! suppress_error)
+				diag_log();
+			continue;
+		}
+		result_mp = exec_res;
+		result_mp_end = exec_res + mp_size;
+	}
+	struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
+							      result_mp_end);
+	region_truncate(region, region_svp);
+	if (new_terminal_stmt == NULL)
+		return NULL;
+	vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));
+	return new_terminal_stmt;
+}
 
-	*result_stmt = vy_stmt_new_upsert(format, key_mp, key_mp_end,
-					  operations, 1);
-	if (*result_stmt == NULL)
-		return -1;
-	return 0;
+/**
+ * Unpack upsert's update operations from msgpack array
+ * into array of iovecs.
+ */
+static void
+upsert_ops_to_iovec(const char *ops, uint32_t ops_cnt, struct iovec *iov_arr)
+{
+	for (uint32_t i = 0; i < ops_cnt; ++i) {
+		assert(mp_typeof(*ops) == MP_ARRAY);
+		iov_arr[i].iov_base = (char *) ops;
+		mp_next(&ops);
+		iov_arr[i].iov_len = ops - (char *) iov_arr[i].iov_base;
+	}
 }
 
 struct tuple *
@@ -87,122 +198,59 @@ vy_apply_upsert(struct tuple *new_stmt, struct tuple *old_stmt,
 	assert(new_stmt != old_stmt);
 	assert(vy_stmt_type(new_stmt) == IPROTO_UPSERT);
 
-	if (old_stmt == NULL || vy_stmt_type(old_stmt) == IPROTO_DELETE) {
-		/*
-		 * INSERT case: return new stmt.
-		 */
-		return vy_stmt_replace_from_upsert(new_stmt);
+	struct tuple *result_stmt = NULL;
+	if (old_stmt == NULL || vy_stmt_type(old_stmt) != IPROTO_UPSERT) {
+		return vy_apply_upsert_on_terminal_stmt(new_stmt, old_stmt,
+						        cmp_def, suppress_error);
 	}
 
-	struct tuple_format *format = tuple_format(new_stmt);
-
+	assert(old_stmt != NULL);
+	assert(vy_stmt_type(old_stmt) == IPROTO_UPSERT);
 	/*
-	 * Unpack UPSERT operation from the new stmt
+	 * Unpack UPSERT operation from the old and new stmts.
 	 */
 	uint32_t mp_size;
-	const char *new_ops;
-	new_ops = vy_stmt_upsert_ops(new_stmt, &mp_size);
-	const char *new_ops_end = new_ops + mp_size;
-
+	const char *old_ops = vy_stmt_upsert_ops(old_stmt, &mp_size);
+	const char *old_stmt_mp = vy_upsert_data_range(old_stmt, &mp_size);
+	const char *old_stmt_mp_end = old_stmt_mp + mp_size;
+	const char *new_ops = vy_stmt_upsert_ops(new_stmt, &mp_size);
 	/*
-	 * Apply new operations to the old stmt
+	 * UPSERT + UPSERT case: unpack operations to iovec array and merge
+	 * them into one ops array.
 	 */
-	const char *result_mp;
-	if (vy_stmt_type(old_stmt) == IPROTO_UPSERT)
-		result_mp = vy_upsert_data_range(old_stmt, &mp_size);
-	else
-		result_mp = tuple_data_range(old_stmt, &mp_size);
-	const char *result_mp_end = result_mp + mp_size;
-	struct tuple *result_stmt = NULL;
+	struct tuple_format *format = tuple_format(old_stmt);
 	struct region *region = &fiber()->gc;
 	size_t region_svp = region_used(region);
-	uint8_t old_type = vy_stmt_type(old_stmt);
-	uint64_t column_mask = COLUMN_MASK_FULL;
-	result_mp = xrow_upsert_execute(new_ops, new_ops_end, result_mp,
-					result_mp_end, format, &mp_size,
-					0, suppress_error, &column_mask);
-	if (result_mp == NULL) {
+	uint32_t old_ops_cnt = mp_decode_array(&old_ops);
+	uint32_t new_ops_cnt = mp_decode_array(&new_ops);
+	size_t ops_size;
+	struct iovec *operations =
+		region_alloc_array(region, typeof(operations[0]),
+				   old_ops_cnt + new_ops_cnt, &ops_size);
+	if (operations == NULL) {
 		region_truncate(region, region_svp);
+		diag_set(OutOfMemory, ops_size, "region_alloc_array",
+			 "operations");
 		return NULL;
 	}
-	result_mp_end = result_mp + mp_size;
-	if (old_type != IPROTO_UPSERT) {
-		assert(old_type == IPROTO_INSERT ||
-		       old_type == IPROTO_REPLACE);
-		/*
-		 * UPDATE case: return the updated old stmt.
-		 */
-		result_stmt = vy_stmt_new_replace(format, result_mp,
-						  result_mp_end);
-		region_truncate(region, region_svp);
-		if (result_stmt == NULL)
-			return NULL; /* OOM */
-		vy_stmt_set_lsn(result_stmt, vy_stmt_lsn(new_stmt));
-		goto check_key;
-	}
-
-	/*
-	 * Unpack UPSERT operation from the old stmt
-	 */
-	assert(old_stmt != NULL);
-	const char *old_ops;
-	old_ops = vy_stmt_upsert_ops(old_stmt, &mp_size);
-	const char *old_ops_end = old_ops + mp_size;
-	assert(old_ops_end > old_ops);
-
+	upsert_ops_to_iovec(old_ops, old_ops_cnt, operations);
+	upsert_ops_to_iovec(new_ops, new_ops_cnt, &operations[old_ops_cnt]);
 	/*
-	 * UPSERT + UPSERT case: combine operations
+	 * Adding update operations. We keep order of update operations in
+	 * the array the same. It is vital since first set of operations
+	 * must be skipped in case upsert folds into insert. For instance:
+	 * old_ops = {{{op1}, {op2}}, {{op3}}}
+	 * new_ops = {{{op4}, {op5}}}
+	 * res_ops = {{{op1}, {op2}}, {{op3}}, {{op4}, {op5}}}
+	 * If upsert corresponding to old_ops becomes insert, then
+	 * {{op1}, {op2}} update operations are not applied.
 	 */
-	assert(old_ops_end - old_ops > 0);
-	if (vy_upsert_try_to_squash(format, result_mp, result_mp_end,
-				    old_ops, old_ops_end, new_ops, new_ops_end,
-				    &result_stmt) != 0) {
-		region_truncate(region, region_svp);
-		return NULL;
-	}
-	if (result_stmt != NULL) {
-		region_truncate(region, region_svp);
-		vy_stmt_set_lsn(result_stmt, vy_stmt_lsn(new_stmt));
-		goto check_key;
-	}
-
-	/* Failed to squash, simply add one upsert to another */
-	int old_ops_cnt, new_ops_cnt;
-	struct iovec operations[3];
-
-	old_ops_cnt = mp_decode_array(&old_ops);
-	operations[1].iov_base = (void *)old_ops;
-	operations[1].iov_len = old_ops_end - old_ops;
-
-	new_ops_cnt = mp_decode_array(&new_ops);
-	operations[2].iov_base = (void *)new_ops;
-	operations[2].iov_len = new_ops_end - new_ops;
-
-	char ops_buf[16];
-	char *header = mp_encode_array(ops_buf, old_ops_cnt + new_ops_cnt);
-	operations[0].iov_base = (void *)ops_buf;
-	operations[0].iov_len = header - ops_buf;
-
-	result_stmt = vy_stmt_new_upsert(format, result_mp, result_mp_end,
-					 operations, 3);
+	result_stmt = vy_stmt_new_upsert(format, old_stmt_mp, old_stmt_mp_end,
+					 operations, old_ops_cnt + new_ops_cnt,
+					 false);
 	region_truncate(region, region_svp);
 	if (result_stmt == NULL)
 		return NULL;
 	vy_stmt_set_lsn(result_stmt, vy_stmt_lsn(new_stmt));
-
-check_key:
-	/*
-	 * Check that key hasn't been changed after applying operations.
-	 */
-	if (!key_update_can_be_skipped(cmp_def->column_mask, column_mask) &&
-	    vy_stmt_compare(old_stmt, HINT_NONE, result_stmt,
-			    HINT_NONE, cmp_def) != 0) {
-		/*
-		 * Key has been changed: ignore this UPSERT and
-		 * @retval the old stmt.
-		 */
-		tuple_unref(result_stmt);
-		result_stmt = vy_stmt_dup(old_stmt);
-	}
 	return result_stmt;
 }
diff --git a/test/unit/vy_iterators_helper.c b/test/unit/vy_iterators_helper.c
index 0d20f19ef..15470920b 100644
--- a/test/unit/vy_iterators_helper.c
+++ b/test/unit/vy_iterators_helper.c
@@ -112,7 +112,7 @@ vy_new_simple_stmt(struct tuple_format *format, struct key_def *key_def,
 			ops = mp_encode_int(ops, templ->upsert_value);
 		operations[0].iov_base = tmp;
 		operations[0].iov_len = ops - tmp;
-		ret = vy_stmt_new_upsert(format, buf, pos, operations, 1);
+		ret = vy_stmt_new_upsert(format, buf, pos, operations, 1, true);
 		fail_if(ret == NULL);
 		break;
 	}
diff --git a/test/vinyl/upsert.result b/test/vinyl/upsert.result
index 3a7f6629d..a66e87be0 100644
--- a/test/vinyl/upsert.result
+++ b/test/vinyl/upsert.result
@@ -899,3 +899,476 @@ s:select()
 s:drop()
 ---
 ...
+-- gh-5107: don't squash upsert operations into one array.
+--
+-- gh-5087: test upsert execution/squash referring to fields in reversed
+-- order (via negative indexing).
+--
+s = box.schema.create_space('test', {engine = 'vinyl'})
+---
+...
+pk = s:create_index('pk')
+---
+...
+s:insert({1, 1, 1})
+---
+- [1, 1, 1]
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1}, {{'=', 3, 100}})
+---
+...
+s:upsert({1}, {{'=', -1, 200}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select() -- {1, 1, 200}
+---
+- - [1, 1, 200]
+...
+s:delete({1})
+---
+...
+s:insert({1, 1, 1})
+---
+- [1, 1, 1]
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1}, {{'=', -3, 100}})
+---
+...
+s:upsert({1}, {{'=', -1, 200}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+-- gh-5105: Two upserts are NOT squashed into one, so only one (first one)
+-- is skipped, meanwhile second one is applied.
+--
+s:select() -- {1, 1, 1}
+---
+- - [1, 1, 200]
+...
+s:delete({1})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 1}, {{'=', -2, 300}}) -- {1, 1}
+---
+...
+s:upsert({1}, {{'+', -1, 100}}) -- {1, 101}
+---
+...
+s:upsert({1}, {{'-', 2, 100}}) -- {1, 1}
+---
+...
+s:upsert({1}, {{'+', -1, 200}}) -- {1, 201}
+---
+...
+s:upsert({1}, {{'-', 2, 200}}) -- {1, 1}
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select() -- {1, 1}
+---
+- - [1, 1]
+...
+s:delete({1})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 1, 1}, {{'!', -1, 300}}) -- {1, 1, 1}
+---
+...
+s:upsert({1}, {{'+', -2, 100}}) -- {1, 101, 1}
+---
+...
+s:upsert({1}, {{'=', -1, 100}}) -- {1, 101, 100}
+---
+...
+s:upsert({1}, {{'+', -1, 200}}) -- {1, 101, 300}
+---
+...
+s:upsert({1}, {{'-', -2, 100}}) -- {1, 1, 300}
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 1, 300]
+...
+s:drop()
+---
+...
+-- gh-1622: upsert operations which break space format are not applied.
+--
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+---
+...
+pk = s:create_index('pk')
+---
+...
+s:replace{1, 1}
+---
+- [1, 1]
+...
+-- Error is logged, upsert is not applied.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+---
+...
+-- During read the incorrect upsert is ignored.
+--
+s:select{}
+---
+- - [1, 1]
+...
+-- Try to set incorrect field_count in a transaction.
+--
+box.begin()
+---
+...
+s:replace{2, 2}
+---
+- [2, 2]
+...
+s:upsert({2, 2}, {{'=', 3, 2}})
+---
+...
+s:select{}
+---
+- - [1, 1]
+  - [2, 2]
+...
+box.commit()
+---
+...
+s:select{}
+---
+- - [1, 1]
+  - [2, 2]
+...
+-- Read incorrect upsert from a run: it should be ignored.
+--
+box.snapshot()
+---
+- ok
+...
+s:select{}
+---
+- - [1, 1]
+  - [2, 2]
+...
+s:upsert({2, 2}, {{'=', 3, 20}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select{}
+---
+- - [1, 1]
+  - [2, 2]
+...
+-- Execute replace/delete after invalid upsert.
+--
+box.snapshot()
+---
+- ok
+...
+s:upsert({2, 2}, {{'=', 3, 30}})
+---
+...
+s:replace{2, 3}
+---
+- [2, 3]
+...
+s:select{}
+---
+- - [1, 1]
+  - [2, 3]
+...
+s:upsert({1, 1}, {{'=', 3, 30}})
+---
+...
+s:delete{1}
+---
+...
+s:select{}
+---
+- - [2, 3]
+...
+-- Invalid upsert in a sequence of upserts is skipped meanwhile
+-- the rest are applied.
+--
+box.snapshot()
+---
+- ok
+...
+s:upsert({2, 2}, {{'+', 2, 5}})
+---
+...
+s:upsert({2, 2}, {{'=', 3, 40}})
+---
+...
+s:upsert({2, 2}, {{'+', 2, 5}})
+---
+...
+s:select{}
+---
+- - [2, 13]
+...
+box.snapshot()
+---
+- ok
+...
+s:select{}
+---
+- - [2, 13]
+...
+s:drop()
+---
+...
+-- Test different scenarious during which update operations squash can't
+-- take place due to format violations.
+--
+decimal = require('decimal')
+---
+...
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 5 })
+---
+...
+s:format({{name='id', type='unsigned'}, {name='u', type='unsigned'},\
+          {name='s', type='scalar'}, {name='f', type='double'},\
+          {name='d', type='decimal'}})
+---
+...
+pk = s:create_index('pk')
+---
+...
+s:replace{1, 1, 1, 1.1, decimal.new(1.1) }
+---
+- [1, 1, 1, 1.1, 1.1]
+...
+s:replace{2, 1, 1, 1.1, decimal.new(1.1)}
+---
+- [2, 1, 1, 1.1, 1.1]
+...
+box.snapshot()
+---
+- ok
+...
+-- Can't assign integer to float field. First operation is still applied.
+--
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 4, 4}})
+---
+...
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'=', 4, 4}})
+---
+...
+-- Can't add floating point to integer (result is floating point).
+--
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, 5}})
+---
+...
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, 5.5}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 1, 1, 5.1, 1.1]
+  - [2, 6, 1, 1.1, 1.1]
+...
+-- Integer overflow check.
+--
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 3, 9223372036854775808}})
+---
+...
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 3, 9223372036854775808}})
+---
+...
+-- Negative result of subtraction stored in unsigned field.
+--
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, 2}})
+---
+...
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'-', 2, 10}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 1, 9223372036854775809, 5.1, 1.1]
+  - [2, 8, 1, 1.1, 1.1]
+...
+-- Decimals do not fit into numerics and vice versa.
+--
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 5, 2}})
+---
+...
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'-', 5, 1}})
+---
+...
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, decimal.new(2.1)}})
+---
+...
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'-', 2, decimal.new(1.2)}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 1, 9223372036854775809, 5.1, 2.1]
+  - [2, 8, 1, 1.1, 1.1]
+...
+s:drop()
+---
+...
+-- Upserts leading to overflow are ignored.
+--
+format = {}
+---
+...
+format[1] = {name = 'f1', type = 'unsigned'}
+---
+...
+format[2] = {name = 'f2', type = 'unsigned'}
+---
+...
+s = box.schema.space.create('test', {engine = 'vinyl', format = format})
+---
+...
+_ = s:create_index('pk')
+---
+...
+uint_max = 18446744073709551615ULL
+---
+...
+s:replace{1, uint_max - 2}
+---
+- [1, 18446744073709551613]
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 0}, {{'+', 2, 1}})
+---
+...
+s:upsert({1, 0}, {{'+', 2, 1}})
+---
+...
+s:upsert({1, 0}, {{'+', 2, 1}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 18446744073709551615]
+...
+s:delete{1}
+---
+...
+s:replace{1, uint_max - 2, 0}
+---
+- [1, 18446744073709551613, 0]
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 0, 0}, {{'+', 2, 1}})
+---
+...
+s:upsert({1, 0, 0}, {{'+', 2, 1}})
+---
+...
+s:upsert({1, 0, 0}, {{'+', 2, 0.5}})
+---
+...
+s:upsert({1, 0, 0}, {{'+', 2, 1}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 18446744073709551615, 0]
+...
+s:drop()
+---
+...
+-- Make sure upserts satisfy associativity rule.
+--
+s = box.schema.space.create('test', {engine='vinyl'})
+---
+...
+i = s:create_index('pk', {parts={2, 'uint'}})
+---
+...
+s:replace{1, 2, 3, 'default'}
+---
+- [1, 2, 3, 'default']
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({2, 2, 2}, {{'=', 4, 'upserted'}})
+---
+...
+-- Upsert will fail and thus ignored.
+--
+s:upsert({2, 2, 2}, {{'#', 1, 1}, {'!', 3, 1}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select{}
+---
+- - [1, 2, 3, 'upserted']
+...
+s:drop()
+---
+...
diff --git a/test/vinyl/upsert.test.lua b/test/vinyl/upsert.test.lua
index 1d77474da..ad69f4b72 100644
--- a/test/vinyl/upsert.test.lua
+++ b/test/vinyl/upsert.test.lua
@@ -372,3 +372,197 @@ box.snapshot()
 s:select()
 
 s:drop()
+
+-- gh-5107: don't squash upsert operations into one array.
+--
+-- gh-5087: test upsert execution/squash referring to fields in reversed
+-- order (via negative indexing).
+--
+s = box.schema.create_space('test', {engine = 'vinyl'})
+pk = s:create_index('pk')
+s:insert({1, 1, 1})
+box.snapshot()
+
+s:upsert({1}, {{'=', 3, 100}})
+s:upsert({1}, {{'=', -1, 200}})
+box.snapshot()
+s:select() -- {1, 1, 200}
+
+s:delete({1})
+s:insert({1, 1, 1})
+box.snapshot()
+
+s:upsert({1}, {{'=', -3, 100}})
+s:upsert({1}, {{'=', -1, 200}})
+box.snapshot()
+-- gh-5105: Two upserts are NOT squashed into one, so only one (first one)
+-- is skipped, meanwhile second one is applied.
+--
+s:select() -- {1, 1, 1}
+
+s:delete({1})
+box.snapshot()
+
+s:upsert({1, 1}, {{'=', -2, 300}}) -- {1, 1}
+s:upsert({1}, {{'+', -1, 100}}) -- {1, 101}
+s:upsert({1}, {{'-', 2, 100}}) -- {1, 1}
+s:upsert({1}, {{'+', -1, 200}}) -- {1, 201}
+s:upsert({1}, {{'-', 2, 200}}) -- {1, 1}
+box.snapshot()
+s:select() -- {1, 1}
+
+s:delete({1})
+box.snapshot()
+
+s:upsert({1, 1, 1}, {{'!', -1, 300}}) -- {1, 1, 1}
+s:upsert({1}, {{'+', -2, 100}}) -- {1, 101, 1}
+s:upsert({1}, {{'=', -1, 100}}) -- {1, 101, 100}
+s:upsert({1}, {{'+', -1, 200}}) -- {1, 101, 300}
+s:upsert({1}, {{'-', -2, 100}}) -- {1, 1, 300}
+box.snapshot()
+s:select()
+
+s:drop()
+
+-- gh-1622: upsert operations which break space format are not applied.
+--
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+pk = s:create_index('pk')
+s:replace{1, 1}
+-- Error is logged, upsert is not applied.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+-- During read the incorrect upsert is ignored.
+--
+s:select{}
+
+-- Try to set incorrect field_count in a transaction.
+--
+box.begin()
+s:replace{2, 2}
+s:upsert({2, 2}, {{'=', 3, 2}})
+s:select{}
+box.commit()
+s:select{}
+
+-- Read incorrect upsert from a run: it should be ignored.
+--
+box.snapshot()
+s:select{}
+s:upsert({2, 2}, {{'=', 3, 20}})
+box.snapshot()
+s:select{}
+
+-- Execute replace/delete after invalid upsert.
+--
+box.snapshot()
+s:upsert({2, 2}, {{'=', 3, 30}})
+s:replace{2, 3}
+s:select{}
+
+s:upsert({1, 1}, {{'=', 3, 30}})
+s:delete{1}
+s:select{}
+
+-- Invalid upsert in a sequence of upserts is skipped meanwhile
+-- the rest are applied.
+--
+box.snapshot()
+s:upsert({2, 2}, {{'+', 2, 5}})
+s:upsert({2, 2}, {{'=', 3, 40}})
+s:upsert({2, 2}, {{'+', 2, 5}})
+s:select{}
+box.snapshot()
+s:select{}
+
+s:drop()
+
+-- Test different scenarious during which update operations squash can't
+-- take place due to format violations.
+--
+decimal = require('decimal')
+
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 5 })
+s:format({{name='id', type='unsigned'}, {name='u', type='unsigned'},\
+          {name='s', type='scalar'}, {name='f', type='double'},\
+          {name='d', type='decimal'}})
+pk = s:create_index('pk')
+s:replace{1, 1, 1, 1.1, decimal.new(1.1) }
+s:replace{2, 1, 1, 1.1, decimal.new(1.1)}
+box.snapshot()
+-- Can't assign integer to float field. First operation is still applied.
+--
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 4, 4}})
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'=', 4, 4}})
+-- Can't add floating point to integer (result is floating point).
+--
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, 5}})
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, 5.5}})
+box.snapshot()
+s:select()
+-- Integer overflow check.
+--
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 3, 9223372036854775808}})
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 3, 9223372036854775808}})
+-- Negative result of subtraction stored in unsigned field.
+--
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, 2}})
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'-', 2, 10}})
+box.snapshot()
+s:select()
+-- Decimals do not fit into numerics and vice versa.
+--
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 5, 2}})
+s:upsert({1, 1, 1, 2.5, decimal.new(1.1)}, {{'-', 5, 1}})
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'+', 2, decimal.new(2.1)}})
+s:upsert({2, 1, 1, 2.5, decimal.new(1.1)}, {{'-', 2, decimal.new(1.2)}})
+box.snapshot()
+s:select()
+
+s:drop()
+
+-- Upserts leading to overflow are ignored.
+--
+format = {}
+format[1] = {name = 'f1', type = 'unsigned'}
+format[2] = {name = 'f2', type = 'unsigned'}
+s = box.schema.space.create('test', {engine = 'vinyl', format = format})
+_ = s:create_index('pk')
+uint_max = 18446744073709551615ULL
+s:replace{1, uint_max - 2}
+box.snapshot()
+
+s:upsert({1, 0}, {{'+', 2, 1}})
+s:upsert({1, 0}, {{'+', 2, 1}})
+s:upsert({1, 0}, {{'+', 2, 1}})
+box.snapshot()
+s:select()
+
+s:delete{1}
+s:replace{1, uint_max - 2, 0}
+box.snapshot()
+
+s:upsert({1, 0, 0}, {{'+', 2, 1}})
+s:upsert({1, 0, 0}, {{'+', 2, 1}})
+s:upsert({1, 0, 0}, {{'+', 2, 0.5}})
+s:upsert({1, 0, 0}, {{'+', 2, 1}})
+box.snapshot()
+s:select()
+s:drop()
+
+-- Make sure upserts satisfy associativity rule.
+--
+s = box.schema.space.create('test', {engine='vinyl'})
+i = s:create_index('pk', {parts={2, 'uint'}})
+s:replace{1, 2, 3, 'default'}
+box.snapshot()
+
+s:upsert({2, 2, 2}, {{'=', 4, 'upserted'}})
+-- Upsert will fail and thus ignored.
+--
+s:upsert({2, 2, 2}, {{'#', 1, 1}, {'!', 3, 1}})
+box.snapshot()
+
+s:select{}
+
+s:drop()
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Tarantool-patches] [PATCH v3 2/2] vinyl: remove squash procedures from source code
  2020-10-03 13:28 [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 1/2] " Nikita Pettik
@ 2020-10-03 13:28 ` Nikita Pettik
  2020-10-03 13:52 ` [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-03 13:28 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

After previous commit, there's no need in these functions. However,
someday we may want to reconsider it squash optimization and make it
work without breaking upsert associativity rule. So to not lost initial
squash implementation, let's put its removal in a separate patch.

Follow-up #5107
---
 src/box/xrow_update.c | 144 ------------------------------------------
 src/box/xrow_update.h |  14 ----
 2 files changed, 158 deletions(-)

diff --git a/src/box/xrow_update.c b/src/box/xrow_update.c
index 833975f03..cba182ab2 100644
--- a/src/box/xrow_update.c
+++ b/src/box/xrow_update.c
@@ -415,147 +415,3 @@ xrow_upsert_execute(const char *expr,const char *expr_end,
 
 	return xrow_update_finish(&update, format, p_tuple_len);
 }
-
-const char *
-xrow_upsert_squash(const char *expr1, const char *expr1_end,
-		   const char *expr2, const char *expr2_end,
-		   struct tuple_format *format, size_t *result_size)
-{
-	const char *expr[2] = {expr1, expr2};
-	const char *expr_end[2] = {expr1_end, expr2_end};
-	struct xrow_update update[2];
-	struct tuple_dictionary *dict = format->dict;
-	for (int j = 0; j < 2; j++) {
-		xrow_update_init(&update[j], 0);
-		if (xrow_update_read_ops(&update[j], expr[j], expr_end[j],
-					 dict, 0) != 0)
-			return NULL;
-		mp_decode_array(&expr[j]);
-		int32_t prev_field_no = -1;
-		for (uint32_t i = 0; i < update[j].op_count; i++) {
-			struct xrow_update_op *op = &update[j].ops[i];
-			if (op->opcode != '+' && op->opcode != '-' &&
-			    op->opcode != '=')
-				return NULL;
-			/*
-			 * Not terminal operation means, that the
-			 * update is not flat, and squash would
-			 * need to build a tree of operations to
-			 * find matches. That is too complex,
-			 * squash is skipped.
-			 */
-			if (!xrow_update_op_is_term(op))
-				return NULL;
-			if (op->field_no <= prev_field_no)
-				return NULL;
-			prev_field_no = op->field_no;
-		}
-	}
-	size_t possible_size = expr1_end - expr1 + expr2_end - expr2;
-	const uint32_t space_for_arr_tag = 5;
-	char *buf = (char *) region_alloc(&fiber()->gc,
-					  possible_size + space_for_arr_tag);
-	if (buf == NULL) {
-		diag_set(OutOfMemory, possible_size + space_for_arr_tag,
-			 "region_alloc", "buf");
-		return NULL;
-	}
-	/* reserve some space for mp array header */
-	char *res_ops = buf + space_for_arr_tag;
-	uint32_t res_count = 0; /* number of resulting operations */
-
-	uint32_t op_count[2] = {update[0].op_count, update[1].op_count};
-	uint32_t op_no[2] = {0, 0};
-	struct json_tree *format_tree = &format->fields;
-	struct json_token *root = &format_tree->root;
-	struct json_token token;
-	token.type = JSON_TOKEN_NUM;
-	while (op_no[0] < op_count[0] || op_no[1] < op_count[1]) {
-		res_count++;
-		struct xrow_update_op *op[2] = {update[0].ops + op_no[0],
-						update[1].ops + op_no[1]};
-		/*
-		 * from:
-		 * 0 - take op from first update,
-		 * 1 - take op from second update,
-		 * 2 - merge both ops
-		 */
-		uint32_t from;
-		uint32_t has[2] = {op_no[0] < op_count[0], op_no[1] < op_count[1]};
-		assert(has[0] || has[1]);
-		if (has[0] && has[1]) {
-			from = op[0]->field_no < op[1]->field_no ? 0 :
-			       op[0]->field_no > op[1]->field_no ? 1 : 2;
-		} else {
-			assert(has[0] != has[1]);
-			from = has[1];
-		}
-		if (from == 2 && op[1]->opcode == '=') {
-			/*
-			 * If an operation from the second upsert is '='
-			 * it is just overwrites any op from the first upsert.
-			 * So we just skip op from the first upsert and
-			 * copy op from the second
-			 */
-			mp_next(&expr[0]);
-			op_no[0]++;
-			from = 1;
-		}
-		if (from < 2) {
-			/* take op from one of upserts */
-			const char *copy = expr[from];
-			mp_next(&expr[from]);
-			size_t copy_size = expr[from] - copy;
-			memcpy(res_ops, copy, copy_size);
-			res_ops += copy_size;
-			op_no[from]++;
-			continue;
-		}
-		/* merge: apply second '+' or '-' */
-		assert(op[1]->opcode == '+' || op[1]->opcode == '-');
-		if (op[0]->opcode == '-') {
-			op[0]->opcode = '+';
-			int96_invert(&op[0]->arg.arith.int96);
-		}
-		struct xrow_update_arg_arith arith;
-		/*
-		 * If we apply + or - on top of the set operation ('='),
-		 * then resulting operation is {'=', op1.val + op2.val}.
-		 * To continue processing arithmetic operation we should
-		 * firstly extract value from set operation holder.
-		 */
-		if (op[0]->opcode == '=') {
-			if (xrow_mp_read_arg_arith(op[0], &op[0]->arg.set.value,
-					           &arith) != 0)
-				return NULL;
-		} else {
-			arith = op[0]->arg.arith;
-		}
-		struct xrow_update_op res;
-		if (xrow_update_arith_make(op[1], arith,
-					   &res.arg.arith) != 0)
-			return NULL;
-		res_ops = mp_encode_array(res_ops, 3);
-		res_ops = mp_encode_str(res_ops,
-					(const char *)&op[0]->opcode, 1);
-		token.num = op[0]->field_no;
-		res_ops = mp_encode_uint(res_ops, token.num +
-					 update[0].index_base);
-		struct json_token *this_node =
-			json_tree_lookup(format_tree, root, &token);
-		xrow_update_op_store_arith(&res, format_tree, this_node, NULL,
-					   res_ops);
-		res_ops += xrow_update_arg_arith_sizeof(&res.arg.arith);
-		mp_next(&expr[0]);
-		mp_next(&expr[1]);
-		op_no[0]++;
-		op_no[1]++;
-	}
-	assert(op_no[0] == op_count[0] && op_no[1] == op_count[1]);
-	assert(expr[0] == expr_end[0] && expr[1] == expr_end[1]);
-	char *arr_start = buf + space_for_arr_tag -
-		mp_sizeof_array(res_count);
-	mp_encode_array(arr_start, res_count);
-	*result_size = res_ops - arr_start;
-	return arr_start;
-}
diff --git a/src/box/xrow_update.h b/src/box/xrow_update.h
index ba4abb1b0..64b4cd034 100644
--- a/src/box/xrow_update.h
+++ b/src/box/xrow_update.h
@@ -63,20 +63,6 @@ xrow_upsert_execute(const char *expr, const char *expr_end,
 		    int index_base, bool suppress_error,
 		    uint64_t *column_mask);
 
-/**
- * Try to merge two update/upsert expressions to an equivalent one.
- * Due to optimization reasons resulting expression
- * is located inside a bigger allocation. There also some hidden
- * internal allocations are made in this function.
- * Thus the only allocator that can be used in this function
- * is region allocator.
- * If it isn't possible to merge expressions NULL is returned.
- */
-const char *
-xrow_upsert_squash(const char *expr1, const char *expr1_end,
-		   const char *expr2, const char *expr2_end,
-		   struct tuple_format *format, size_t *result_size);
-
 #if defined(__cplusplus)
 } /* extern "C" */
 #endif /* defined(__cplusplus) */
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-03 13:28 [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 1/2] " Nikita Pettik
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 2/2] vinyl: remove squash procedures from source code Nikita Pettik
@ 2020-10-03 13:52 ` Nikita Pettik
  2020-10-06 22:12 ` Vladislav Shpilevoy
  2020-10-11 15:35 ` Vladislav Shpilevoy
  4 siblings, 0 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-03 13:52 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

On 03 Oct 16:28, Nikita Pettik wrote:
> Issues:
> https://github.com/tarantool/tarantool/issues/1622
> https://github.com/tarantool/tarantool/issues/5105
> https://github.com/tarantool/tarantool/issues/5092
> https://github.com/tarantool/tarantool/issues/5107
> Branch:
> https://github.com/tarantool/tarantool/tree/np/gh-5107-dont-squash-ops

Sorry guys, the real branch is https://github.com/tarantool/tarantool/tree/np/gh-5107-dont-squash-and-merge-ops
 
> @ChangeLog:
>  - Rework upsert operation in vinyl so that now (gh-5107):
>    - if upsert can't be applied it is skipped and corresponding error is logged (gh-1622);
>    - upserts now follow associative property: result of several upserts
>      doesn't depend on the order of their application (gh-5105);
>    - upserts referring to -1 fieldno are handled correctly now (gh-5087).
>    - there's no more upserts squash procedure: upserts referring to the
>      same field with arithmetic operations are not merged into one 
>      operation since resulting upsert might not be applied - as a result
>      both upserts would be ignored (meanwhile only one should be).
> 
> Changes in v3:
>  - Removed upsert squashing procedure;
>  - Added two additional test covering overflow error during upsert
>    application;
>  - Fixed NULL dereference bug in vy_apply_upsert_on_terminal_stmt().
> 
> Nikita Pettik (2):
>   vinyl: rework upsert operation
>   vinyl: remove squash procedures from source code
> 
>  src/box/vinyl.c                 |   2 +-
>  src/box/vy_stmt.c               |  28 +-
>  src/box/vy_stmt.h               |   5 +-
>  src/box/vy_upsert.c             | 302 +++++++++++---------
>  src/box/xrow_update.c           | 144 ----------
>  src/box/xrow_update.h           |  14 -
>  test/unit/vy_iterators_helper.c |   2 +-
>  test/vinyl/upsert.result        | 473 ++++++++++++++++++++++++++++++++
>  test/vinyl/upsert.test.lua      | 194 +++++++++++++
>  9 files changed, 867 insertions(+), 297 deletions(-)
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-03 13:28 [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
                   ` (2 preceding siblings ...)
  2020-10-03 13:52 ` [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
@ 2020-10-06 22:12 ` Vladislav Shpilevoy
  2020-10-09 15:10   ` Nikita Pettik
  2020-10-11 15:35 ` Vladislav Shpilevoy
  4 siblings, 1 reply; 19+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-06 22:12 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Hi! Thanks for the patchset!

Does this patchset change format of vinyl statements on disk? If yes, I guess
it breaks backward compatibility. Did you check how does it behave when booted
from files of a previous version? Does it crash? Does it report an error? How
a user can upgrade his vinyl files to the new format?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 1/2] " Nikita Pettik
@ 2020-10-06 22:12   ` Vladislav Shpilevoy
  2020-10-09 15:06     ` Nikita Pettik
  2020-10-13 19:00   ` Aleksandr Lyapunov
  1 sibling, 1 reply; 19+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-06 22:12 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Thanks for the patch!

See 3 comments below, and my fixes on top of this branch in a
separate commit.

>  src/box/vinyl.c                 |   2 +-
>  src/box/vy_stmt.c               |  28 +-
>  src/box/vy_stmt.h               |   5 +-
>  src/box/vy_upsert.c             | 302 +++++++++++---------
>  test/unit/vy_iterators_helper.c |   2 +-
>  test/vinyl/upsert.result        | 473 ++++++++++++++++++++++++++++++++
>  test/vinyl/upsert.test.lua      | 194 +++++++++++++
>  7 files changed, 867 insertions(+), 139 deletions(-)
> 
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index cee39c58c..9d9ee0613 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -1976,7 +1976,7 @@ vy_lsm_upsert(struct vy_tx *tx, struct vy_lsm *lsm,
>  	operations[0].iov_base = (void *)expr;
>  	operations[0].iov_len = expr_end - expr;
>  	vystmt = vy_stmt_new_upsert(lsm->mem_format, tuple, tuple_end,
> -				    operations, 1);
> +				    operations, 1, false);

1. You don't really need that flag. In the previous review I tried to
explain it, but now I added a commit on top of your branch where I
removed this parameter. Take a look.

This argument does not have much sense. First of all, the operations array
is not really operations. Iovecs here can be separate operations; it can be
a single iovec with all operations; it can be several iovecs with first
having additional MP_ARRAY encoded before a first operation. So these are
just pieces of raw data. And you can use that. You can push a header of
these operations as a part of 'operations' array always, like I did in my
commit. No need to pass a new argument for that.

>  	if (vystmt == NULL)
>  		return -1;
>  	assert(vy_stmt_type(vystmt) == IPROTO_UPSERT);
> diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
> index e697b6321..2b5a16730 100644
> --- a/src/box/vy_upsert.c
> +++ b/src/box/vy_upsert.c
> @@ -39,39 +39,150 @@
>  #include "column_mask.h"
>  
>  /**
> - * Try to squash two upsert series (msgspacked index_base + ops)
> - * Try to create a tuple with squahed operations
> + * Check that key hasn't been changed after applying upsert operation.
> + */
> +static bool
> +vy_apply_result_does_cross_pk(struct tuple *old_stmt, const char *result,
> +			      const char *result_end, struct key_def *cmp_def,
> +			      uint64_t col_mask)
> +{
> +	if (!key_update_can_be_skipped(cmp_def->column_mask, col_mask)) {
> +		struct tuple *tuple =
> +			vy_stmt_new_replace(tuple_format(old_stmt), result,
> +					    result_end);
> +		int cmp_res = vy_stmt_compare(old_stmt, HINT_NONE, tuple,
> +					      HINT_NONE, cmp_def);
> +		tuple_unref(tuple);
> +		return cmp_res != 0;
> +	}
> +	return false;
> +}
> +
> +/**
> + * Apply update operations stored in @a upsert on tuple @a stmt. If @a stmt is
> + * void statement (i.e. it is NULL or delete statement) then operations are
> + * applied on tuple stored in @a upsert. Update operations of @a upsert which

2. Applied on tuple stored in @a upsert? Upsert logic is that the operations are
ignored, if it appears the statement is first in its key. Did you mean, that you
apply all operation sets except the first set? Below I see a comment

	/*
	 * In case upsert folds into insert, we must skip first
	 * update operations.
	 */

So this is probably it. Just the function comment is a bit misleading.

> + * can't be applied are skipped along side with other operations from single
> + * group (i.e. packed in one msgpack array); errors may be logged depending on
> + * @a suppress_error flag.
>   *
> - * @retval 0 && *result_stmt != NULL : successful squash
> - * @retval 0 && *result_stmt == NULL : unsquashable sources
> - * @retval -1 - memory error
> + * @param upsert Upsert statement to be applied on @a stmt.
> + * @param stmt Statement to be used as base for upsert operations.
> + * @param cmp_def Key definition required to provide check of primary key
> + *                modification.

3. param suppress_error is missing.

> + * @return Tuple containing result of upsert application; NULL in case OOM.
>   */
My commit on top of the branch:

====================
    [tosquash] vinyl: remove is_ops_encoded argument
    
    It was used in function vy_stmt_new_with_ops, but wasn't really
    needed. Because the function takes an array of iovec structs,
    where the caller can put the array header.

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 9d9ee0613..ee87a6c6e 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -1972,11 +1972,15 @@ vy_lsm_upsert(struct vy_tx *tx, struct vy_lsm *lsm,
 {
 	assert(tx == NULL || tx->state == VINYL_TX_READY);
 	struct tuple *vystmt;
-	struct iovec operations[1];
-	operations[0].iov_base = (void *)expr;
-	operations[0].iov_len = expr_end - expr;
+	struct iovec operations[2];
+	/* MP_ARRAY with size 1. */
+	char header = 0x91;
+	operations[0].iov_base = &header;
+	operations[0].iov_len = 1;
+	operations[1].iov_base = (void *)expr;
+	operations[1].iov_len = expr_end - expr;
 	vystmt = vy_stmt_new_upsert(lsm->mem_format, tuple, tuple_end,
-				    operations, 1, false);
+				    operations, 2);
 	if (vystmt == NULL)
 		return -1;
 	assert(vy_stmt_type(vystmt) == IPROTO_UPSERT);
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 2ee82b3f2..92e0aa1c5 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -313,22 +313,16 @@ vy_key_dup(const char *key)
 static struct tuple *
 vy_stmt_new_with_ops(struct tuple_format *format, const char *tuple_begin,
 		     const char *tuple_end, struct iovec *ops,
-		     int op_count, enum iproto_type type, bool is_ops_encoded)
+		     int op_count, enum iproto_type type)
 {
 	mp_tuple_assert(tuple_begin, tuple_end);
 
 	const char *tmp = tuple_begin;
 	mp_decode_array(&tmp);
 
-	/*
-	 * ops are grouped in one extra array.
-	 * See vy_apply_upsert() for details.
-	 */
 	size_t ops_size = 0;
 	for (int i = 0; i < op_count; ++i)
 		ops_size += ops[i].iov_len;
-	if (!is_ops_encoded)
-		ops_size += mp_sizeof_array(op_count);
 
 	struct tuple *stmt = NULL;
 	struct region *region = &fiber()->gc;
@@ -366,8 +360,6 @@ vy_stmt_new_with_ops(struct tuple_format *format, const char *tuple_begin,
 	field_map_build(&builder, wpos - field_map_size);
 	memcpy(wpos, tuple_begin, mpsize);
 	wpos += mpsize;
-	if (!is_ops_encoded)
-		wpos = mp_encode_array(wpos, op_count);
 	for (struct iovec *op = ops, *end = ops + op_count;
 	     op != end; ++op) {
 		memcpy(wpos, op->iov_base, op->iov_len);
@@ -382,11 +374,10 @@ end:
 struct tuple *
 vy_stmt_new_upsert(struct tuple_format *format, const char *tuple_begin,
 		   const char *tuple_end, struct iovec *operations,
-		   uint32_t ops_cnt, bool is_ops_encoded)
+		   uint32_t ops_cnt)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    operations, ops_cnt, IPROTO_UPSERT,
-				    is_ops_encoded);
+				    operations, ops_cnt, IPROTO_UPSERT);
 }
 
 struct tuple *
@@ -394,7 +385,7 @@ vy_stmt_new_replace(struct tuple_format *format, const char *tuple_begin,
 		    const char *tuple_end)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    NULL, 0, IPROTO_REPLACE, true);
+				    NULL, 0, IPROTO_REPLACE);
 }
 
 struct tuple *
@@ -402,7 +393,7 @@ vy_stmt_new_insert(struct tuple_format *format, const char *tuple_begin,
 		   const char *tuple_end)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    NULL, 0, IPROTO_INSERT, true);
+				    NULL, 0, IPROTO_INSERT);
 }
 
 struct tuple *
@@ -410,7 +401,7 @@ vy_stmt_new_delete(struct tuple_format *format, const char *tuple_begin,
 		   const char *tuple_end)
 {
 	return vy_stmt_new_with_ops(format, tuple_begin, tuple_end,
-				    NULL, 0, IPROTO_DELETE, true);
+				    NULL, 0, IPROTO_DELETE);
 }
 
 struct tuple *
@@ -744,20 +735,19 @@ vy_stmt_decode(struct xrow_header *xrow, struct tuple_format *format)
 		/* Always use key format for DELETE statements. */
 		stmt = vy_stmt_new_with_ops(env->key_format,
 					    request.key, request.key_end,
-					    NULL, 0, IPROTO_DELETE, true);
+					    NULL, 0, IPROTO_DELETE);
 		break;
 	case IPROTO_INSERT:
 	case IPROTO_REPLACE:
 		stmt = vy_stmt_new_with_ops(format, request.tuple,
 					    request.tuple_end,
-					    NULL, 0, request.type, true);
+					    NULL, 0, request.type);
 		break;
 	case IPROTO_UPSERT:
 		ops.iov_base = (char *)request.ops;
 		ops.iov_len = request.ops_end - request.ops;
 		stmt = vy_stmt_new_upsert(format, request.tuple,
-					  request.tuple_end, &ops,
-					  1, true);
+					  request.tuple_end, &ops, 1);
 		break;
 	default:
 		/* TODO: report filename. */
diff --git a/src/box/vy_stmt.h b/src/box/vy_stmt.h
index 65acbecac..24c7eaad7 100644
--- a/src/box/vy_stmt.h
+++ b/src/box/vy_stmt.h
@@ -526,10 +526,11 @@ vy_stmt_new_delete(struct tuple_format *format, const char *tuple_begin,
  * @param tuple_end End of the array that begins from @param tuple_begin.
  * @param format Format of a tuple for offsets generating.
  * @param part_count Part count from key definition.
- * @param operations Vector of update operations.
+ * @param operations Vector of update operation pieces. Each iovec here may be
+ *     a part of an operation, or a whole operation, or something including
+ *     several operations. It is just a list of buffers. Each buffer is not
+ *     interpreted as an independent operation.
  * @param ops_cnt Length of the update operations vector.
- * @param is_ops_encoded True, if update operations are already packed
-  *                      into extra msgpack array.
  *
  * @retval NULL     Memory allocation error.
  * @retval not NULL Success.
@@ -537,8 +538,7 @@ vy_stmt_new_delete(struct tuple_format *format, const char *tuple_begin,
 struct tuple *
 vy_stmt_new_upsert(struct tuple_format *format,
 		   const char *tuple_begin, const char *tuple_end,
-		   struct iovec *operations, uint32_t ops_cnt,
-		   bool is_ops_encoded);
+		   struct iovec *operations, uint32_t ops_cnt);
 
 /**
  * Create REPLACE statement from UPSERT statement.
diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index 2b5a16730..d73896cad 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -223,18 +223,23 @@ vy_apply_upsert(struct tuple *new_stmt, struct tuple *old_stmt,
 	size_t region_svp = region_used(region);
 	uint32_t old_ops_cnt = mp_decode_array(&old_ops);
 	uint32_t new_ops_cnt = mp_decode_array(&new_ops);
+	uint32_t total_ops_cnt = old_ops_cnt + new_ops_cnt;
 	size_t ops_size;
 	struct iovec *operations =
 		region_alloc_array(region, typeof(operations[0]),
-				   old_ops_cnt + new_ops_cnt, &ops_size);
+				   total_ops_cnt + 1, &ops_size);
 	if (operations == NULL) {
 		region_truncate(region, region_svp);
 		diag_set(OutOfMemory, ops_size, "region_alloc_array",
 			 "operations");
 		return NULL;
 	}
-	upsert_ops_to_iovec(old_ops, old_ops_cnt, operations);
-	upsert_ops_to_iovec(new_ops, new_ops_cnt, &operations[old_ops_cnt]);
+	char header[16];
+	char *header_end = mp_encode_array(header, total_ops_cnt);
+	operations[0].iov_base = header;
+	operations[0].iov_len = header_end - header;
+	upsert_ops_to_iovec(old_ops, old_ops_cnt, &operations[1]);
+	upsert_ops_to_iovec(new_ops, new_ops_cnt, &operations[old_ops_cnt + 1]);
 	/*
 	 * Adding update operations. We keep order of update operations in
 	 * the array the same. It is vital since first set of operations
@@ -246,8 +251,7 @@ vy_apply_upsert(struct tuple *new_stmt, struct tuple *old_stmt,
 	 * {{op1}, {op2}} update operations are not applied.
 	 */
 	result_stmt = vy_stmt_new_upsert(format, old_stmt_mp, old_stmt_mp_end,
-					 operations, old_ops_cnt + new_ops_cnt,
-					 false);
+					 operations, total_ops_cnt + 1);
 	region_truncate(region, region_svp);
 	if (result_stmt == NULL)
 		return NULL;
diff --git a/test/unit/vy_iterators_helper.c b/test/unit/vy_iterators_helper.c
index 15470920b..0d20f19ef 100644
--- a/test/unit/vy_iterators_helper.c
+++ b/test/unit/vy_iterators_helper.c
@@ -112,7 +112,7 @@ vy_new_simple_stmt(struct tuple_format *format, struct key_def *key_def,
 			ops = mp_encode_int(ops, templ->upsert_value);
 		operations[0].iov_base = tmp;
 		operations[0].iov_len = ops - tmp;
-		ret = vy_stmt_new_upsert(format, buf, pos, operations, 1, true);
+		ret = vy_stmt_new_upsert(format, buf, pos, operations, 1);
 		fail_if(ret == NULL);
 		break;
 	}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-06 22:12   ` Vladislav Shpilevoy
@ 2020-10-09 15:06     ` Nikita Pettik
  0 siblings, 0 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-09 15:06 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 07 Oct 00:12, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> 
> See 3 comments below, and my fixes on top of this branch in a
> separate commit.
> 
> >  src/box/vinyl.c                 |   2 +-
> >  src/box/vy_stmt.c               |  28 +-
> >  src/box/vy_stmt.h               |   5 +-
> >  src/box/vy_upsert.c             | 302 +++++++++++---------
> >  test/unit/vy_iterators_helper.c |   2 +-
> >  test/vinyl/upsert.result        | 473 ++++++++++++++++++++++++++++++++
> >  test/vinyl/upsert.test.lua      | 194 +++++++++++++
> >  7 files changed, 867 insertions(+), 139 deletions(-)
> > 
> > diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> > index cee39c58c..9d9ee0613 100644
> > --- a/src/box/vinyl.c
> > +++ b/src/box/vinyl.c
> > @@ -1976,7 +1976,7 @@ vy_lsm_upsert(struct vy_tx *tx, struct vy_lsm *lsm,
> >  	operations[0].iov_base = (void *)expr;
> >  	operations[0].iov_len = expr_end - expr;
> >  	vystmt = vy_stmt_new_upsert(lsm->mem_format, tuple, tuple_end,
> > -				    operations, 1);
> > +				    operations, 1, false);
> 
> 1. You don't really need that flag. In the previous review I tried to
> explain it, but now I added a commit on top of your branch where I
> removed this parameter. Take a look.
> 
> This argument does not have much sense. First of all, the operations array
> is not really operations. Iovecs here can be separate operations; it can be
> a single iovec with all operations; it can be several iovecs with first
> having additional MP_ARRAY encoded before a first operation. So these are
> just pieces of raw data. And you can use that. You can push a header of
> these operations as a part of 'operations' array always, like I did in my
> commit. No need to pass a new argument for that.

Ok, thx, squashed.
 
> >  	if (vystmt == NULL)
> >  		return -1;
> >  	assert(vy_stmt_type(vystmt) == IPROTO_UPSERT);
> > diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
> > index e697b6321..2b5a16730 100644
> > --- a/src/box/vy_upsert.c
> > +++ b/src/box/vy_upsert.c
> > @@ -39,39 +39,150 @@
> >  #include "column_mask.h"
> >  
> > +/**
> > + * Apply update operations stored in @a upsert on tuple @a stmt. If @a stmt is
> > + * void statement (i.e. it is NULL or delete statement) then operations are
> > + * applied on tuple stored in @a upsert. Update operations of @a upsert which
> 
> 2. Applied on tuple stored in @a upsert? Upsert logic is that the operations are
> ignored, if it appears the statement is first in its key. Did you mean, that you
> apply all operation sets except the first set? Below I see a comment
> 
> 	/*
> 	 * In case upsert folds into insert, we must skip first
> 	 * update operations.
> 	 */
> 
> So this is probably it. Just the function comment is a bit misleading.

A bit fixed to this version:

 * Apply update operations from @a upsert on tuple @a stmt.
 
> > + * can't be applied are skipped along side with other operations from single
> > + * group (i.e. packed in one msgpack array); errors may be logged depending on
> > + * @a suppress_error flag.
> >   *
> > - * @retval 0 && *result_stmt != NULL : successful squash
> > - * @retval 0 && *result_stmt == NULL : unsquashable sources
> > - * @retval -1 - memory error
> > + * @param upsert Upsert statement to be applied on @a stmt.
> > + * @param stmt Statement to be used as base for upsert operations.
> > + * @param cmp_def Key definition required to provide check of primary key
> > + *                modification.
> 
> 3. param suppress_error is missing.

Added description:

 * @param suppress_error If true, do not raise/log any errors.

> 
> > + * @return Tuple containing result of upsert application; NULL in case OOM.
> >   */

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-06 22:12 ` Vladislav Shpilevoy
@ 2020-10-09 15:10   ` Nikita Pettik
  2020-10-11 15:35     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 19+ messages in thread
From: Nikita Pettik @ 2020-10-09 15:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 07 Oct 00:12, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
> 
> Does this patchset change format of vinyl statements on disk? If yes, I guess
> it breaks backward compatibility. Did you check how does it behave when booted
> from files of a previous version? Does it crash? Does it report an error? How
> a user can upgrade his vinyl files to the new format?

No, everything goes fine except the fact that upsert statements restored
from disk have ops encoded into single array and as a result being applied
may break associative property (like it was before this patch). I've tried
to add a test, but since squash operation modifies content of run/xlog files,
such test can be run only once. If it is vital, I guess it is possible to
copy data directory before each test run..

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-09 15:10   ` Nikita Pettik
@ 2020-10-11 15:35     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 19+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-11 15:35 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

>> Does this patchset change format of vinyl statements on disk? If yes, I guess
>> it breaks backward compatibility. Did you check how does it behave when booted
>> from files of a previous version? Does it crash? Does it report an error? How
>> a user can upgrade his vinyl files to the new format?
> 
> No, everything goes fine except the fact that upsert statements restored
> from disk have ops encoded into single array and as a result being applied
> may break associative property (like it was before this patch). I've tried
> to add a test, but since squash operation modifies content of run/xlog files,
> such test can be run only once. If it is vital, I guess it is possible to
> copy data directory before each test run..

Yes, it is possible. At least it would be better than not having a test at
all.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-03 13:28 [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
                   ` (3 preceding siblings ...)
  2020-10-06 22:12 ` Vladislav Shpilevoy
@ 2020-10-11 15:35 ` Vladislav Shpilevoy
  2020-10-13 10:18   ` Nikita Pettik
  4 siblings, 1 reply; 19+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-11 15:35 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

The patchset looks good except for not having a backward compatibility test.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-11 15:35 ` Vladislav Shpilevoy
@ 2020-10-13 10:18   ` Nikita Pettik
  2020-10-14 23:44     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 19+ messages in thread
From: Nikita Pettik @ 2020-10-13 10:18 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 11 Oct 17:35, Vladislav Shpilevoy wrote:
> The patchset looks good except for not having a backward compatibility test.

Here's a test:

diff --git a/test/vinyl/upgrade/upsert/00000000000000000000.vylog b/test/vinyl/upgrade/upsert/00000000000000000000.vylog
new file mode 100644
index 000000000..581ad9b53
Binary files /dev/null and b/test/vinyl/upgrade/upsert/00000000000000000000.vylog differ
diff --git a/test/vinyl/upgrade/upsert/00000000000000000004.vylog b/test/vinyl/upgrade/upsert/00000000000000000004.vylog
new file mode 100644
index 000000000..5194b1457
Binary files /dev/null and b/test/vinyl/upgrade/upsert/00000000000000000004.vylog differ
diff --git a/test/vinyl/upgrade/upsert/00000000000000000004.xlog b/test/vinyl/upgrade/upsert/00000000000000000004.xlog
new file mode 100644
index 000000000..5725d09ce
Binary files /dev/null and b/test/vinyl/upgrade/upsert/00000000000000000004.xlog differ
diff --git a/test/vinyl/upgrade/upsert/00000000000000000010.vylog b/test/vinyl/upgrade/upsert/00000000000000000010.vylog
new file mode 100644
index 000000000..954a51a4c
Binary files /dev/null and b/test/vinyl/upgrade/upsert/00000000000000000010.vylog differ
diff --git a/test/vinyl/upgrade/upsert/00000000000000000010.xlog b/test/vinyl/upgrade/upsert/00000000000000000010.xlog
new file mode 100644
index 000000000..f5921223d
--- /dev/null
+++ b/test/vinyl/upgrade/upsert/00000000000000000010.xlog
@@ -0,0 +1,8 @@
+XLOG
+0.13
+Version: 2.6.0-153-g3bc455f7d
+Instance: 4b1aba36-0cbc-4d51-9968-53da3908a43d
+VClock: {1: 10}
+PrevVClock: {1: 4}
+
+<D5>^P<AD><ED>
\ No newline at end of file
diff --git a/test/vinyl/upgrade/upsert/512/0/00000000000000000002.index b/test/vinyl/upgrade/upsert/512/0/00000000000000000002.index
new file mode 100644
index 000000000..156c23dd7
Binary files /dev/null and b/test/vinyl/upgrade/upsert/512/0/00000000000000000002.index differ
diff --git a/test/vinyl/upgrade/upsert/512/0/00000000000000000002.run b/test/vinyl/upgrade/upsert/512/0/00000000000000000002.run
new file mode 100644
index 000000000..9757ea341
Binary files /dev/null and b/test/vinyl/upgrade/upsert/512/0/00000000000000000002.run differ
diff --git a/test/vinyl/upgrade/upsert/512/0/00000000000000000004.index b/test/vinyl/upgrade/upsert/512/0/00000000000000000004.index
new file mode 100644
index 000000000..1c3896547
Binary files /dev/null and b/test/vinyl/upgrade/upsert/512/0/00000000000000000004.index differ
diff --git a/test/vinyl/upgrade/upsert/512/0/00000000000000000004.run b/test/vinyl/upgrade/upsert/512/0/00000000000000000004.run
new file mode 100644
index 000000000..469e2abab
Binary files /dev/null and b/test/vinyl/upgrade/upsert/512/0/00000000000000000004.run differ
diff --git a/test/vinyl/upgrade/upsert/512/0/00000000000000000006.index b/test/vinyl/upgrade/upsert/512/0/00000000000000000006.index
new file mode 100644
index 000000000..9202ec7e7
Binary files /dev/null and b/test/vinyl/upgrade/upsert/512/0/00000000000000000006.index differ
diff --git a/test/vinyl/upgrade/upsert/512/0/00000000000000000006.run b/test/vinyl/upgrade/upsert/512/0/00000000000000000006.run
new file mode 100644
index 000000000..d4fcd7599
Binary files /dev/null and b/test/vinyl/upgrade/upsert/512/0/00000000000000000006.run differ
diff --git a/test/vinyl/upsert_upgrade.result b/test/vinyl/upsert_upgrade.result
new file mode 100644
index 000000000..8882a8b63
--- /dev/null
+++ b/test/vinyl/upsert_upgrade.result
@@ -0,0 +1,59 @@
+-- test-run result file version 2
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+-- Upsert's internal format have changed: now update operations are stored
+-- with additional map package. Let's test backward compatibility.
+-- Snapshot (i.e. run files) contain following statements:
+
+-- s = box.schema.create_space('test', {engine = 'vinyl'})
+-- pk = s:create_index('pk')
+-- s:insert({1, 2})
+-- box.snapshot()
+-- s:upsert({1, 0}, {{'+', 2, 1}})
+-- s:upsert({1, 0}, {{'-', 2, 2}})
+-- s:upsert({2, 0}, {{'+', 2, 1}})
+-- s:upsert({2, 0}, {{'-', 2, 2}})
+-- s:upsert({1, 0}, {{'=', 2, 2}})
+-- s:upsert({1, 0}, {{'-', 2, 2}})
+-- box.snapshot()
+--
+-- Make sure that upserts will be parsed and squashed correctly.
+--
+
+dst_dir = 'vinyl/upgrade/upsert/'
+ | ---
+ | ...
+
+test_run:cmd('create server upgrade with script="vinyl/upgrade.lua", workdir="' .. dst_dir .. '"')
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server upgrade')
+ | ---
+ | - true
+ | ...
+test_run:switch('upgrade')
+ | ---
+ | - true
+ | ...
+
+box.space.test:select()
+ | ---
+ | - - [1, 0]
+ |   - [2, -2]
+ | ...
+
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server upgrade')
+ | ---
+ | - true
+ | ...
+test_run:cmd('cleanup server upgrade')
+ | ---
+ | - true
+ | ...
diff --git a/test/vinyl/upsert_upgrade.test.lua b/test/vinyl/upsert_upgrade.test.lua
new file mode 100644
index 000000000..db409e2dd
--- /dev/null
+++ b/test/vinyl/upsert_upgrade.test.lua
@@ -0,0 +1,32 @@
+test_run = require('test_run').new()
+
+-- Upsert's internal format have changed: now update operations are stored
+-- with additional map package. Let's test backward compatibility.
+-- Snapshot (i.e. run files) contain following statements:
+
+-- s = box.schema.create_space('test', {engine = 'vinyl'})
+-- pk = s:create_index('pk')
+-- s:insert({1, 2})
+-- box.snapshot()
+-- s:upsert({1, 0}, {{'+', 2, 1}})
+-- s:upsert({1, 0}, {{'-', 2, 2}})
+-- s:upsert({2, 0}, {{'+', 2, 1}})
+-- s:upsert({2, 0}, {{'-', 2, 2}})
+-- s:upsert({1, 0}, {{'=', 2, 2}})
+-- s:upsert({1, 0}, {{'-', 2, 2}})
+-- box.snapshot()
+--
+-- Make sure that upserts will be parsed and squashed correctly.
+--
+
+dst_dir = 'vinyl/upgrade/upsert/'
+
+test_run:cmd('create server upgrade with script="vinyl/upgrade.lua", workdir="' .. dst_dir .. '"')
+test_run:cmd('start server upgrade')
+test_run:switch('upgrade')
+
+box.space.test:select()
+
+test_run:switch('default')
+test_run:cmd('stop server upgrade')
+test_run:cmd('cleanup server upgrade')

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 1/2] " Nikita Pettik
  2020-10-06 22:12   ` Vladislav Shpilevoy
@ 2020-10-13 19:00   ` Aleksandr Lyapunov
  2020-10-14  0:15     ` Nikita Pettik
  1 sibling, 1 reply; 19+ messages in thread
From: Aleksandr Lyapunov @ 2020-10-13 19:00 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches; +Cc: v.shpilevoy

Hi, thanks for the patch! Se my 2 comments below:
(the second seems to be important)
Once they resolved, the whole patch is LGTM!

On 10/3/20 4:28 PM, Nikita Pettik wrote:
> into insert the upsert's tuple and the tuple stored in index can be
> different In our particular case, tuple stored on disk has two fields
Dot before `In` is missing.
> +		/*
> +		 * If it turns out that resulting tuple modifies primary
> +		 * key, then simply ignore this upsert.
> +		 */
> +		if (stmt != NULL &&
> +		    vy_apply_result_does_cross_pk(stmt, exec_res,
> +						  exec_res + mp_size, cmp_def,
> +						  column_mask)) {
I guess in case if `stmt_is_void = true` we must compare with `upsert` 
tuple,
not with `stmt` tuple.
> +			if (!suppress_error) {
> +				say_error("upsert operations %s are not applied"\
> +					  " due to primary key modification",
> +					  mp_str(ups_ops));
> +			}
> +			ups_ops = ups_ops_end;
> +			continue;
> +		}
> +		ups_ops = ups_ops_end;
> +		/*
> +		 * Result statement must satisfy space's format. Since upsert's
> +		 * tuple correctness is already checked in vy_upsert(), let's
> +		 * use its format to provide result verification.
> +		 */
> +		struct tuple_format *format = tuple_format(upsert);
> +		if (tuple_validate_raw(format, exec_res) != 0) {
> +			if (! suppress_error)
> +				diag_log();
> +			continue;
> +		}
> +		result_mp = exec_res;
> +		result_mp_end = exec_res + mp_size;
> +	}
> +	struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
> +							      result_mp_end);
> +	region_truncate(region, region_svp);
> +	if (new_terminal_stmt == NULL)
> +		return NULL;
> +	vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));
> +	return new_terminal_stmt;
> +}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-13 19:00   ` Aleksandr Lyapunov
@ 2020-10-14  0:15     ` Nikita Pettik
  2020-10-14  8:58       ` Aleksandr Lyapunov
  0 siblings, 1 reply; 19+ messages in thread
From: Nikita Pettik @ 2020-10-14  0:15 UTC (permalink / raw)
  To: Aleksandr Lyapunov; +Cc: tarantool-patches, v.shpilevoy

On 13 Oct 22:00, Aleksandr Lyapunov wrote:
> Hi, thanks for the patch! Se my 2 comments below:
> (the second seems to be important)
> Once they resolved, the whole patch is LGTM!
> 
> On 10/3/20 4:28 PM, Nikita Pettik wrote:
> > into insert the upsert's tuple and the tuple stored in index can be
> > different In our particular case, tuple stored on disk has two fields
> Dot before `In` is missing.

Fixed.

> > +		/*
> > +		 * If it turns out that resulting tuple modifies primary
> > +		 * key, then simply ignore this upsert.
> > +		 */
> > +		if (stmt != NULL &&
> > +		    vy_apply_result_does_cross_pk(stmt, exec_res,
> > +						  exec_res + mp_size, cmp_def,
> > +						  column_mask)) {
> I guess in case if `stmt_is_void = true` we must compare with `upsert`
> tuple,
> not with `stmt` tuple.

Yep, thanks for input. Fixed and addded corresponding tests:

diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index 090f3304b..0b4baf6ff 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -102,11 +102,15 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
        const char *ups_ops = new_ops;
        /*
         * In case upsert folds into insert, we must skip first
-        * update operations.
+        * update operations. Moreover, we should create (in case of delete
+        * statement - re-create since delete contains only key) tuple with
+        * format to use it for PK modification check.
         */
        if (stmt_is_void) {
                ups_cnt--;
                mp_next(&ups_ops);
+               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
+                                          result_mp_end);
        }
        for (uint32_t i = 0; i < ups_cnt; ++i) {
                assert(mp_typeof(*ups_ops) == MP_ARRAY);
@@ -122,6 +126,8 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
                                assert(e != NULL);
                                /* Bail out immediately in case of OOM. */
                                if (e->type != &type_ClientError) {
+                                       if (stmt_is_void)
+                                               tuple_unref(stmt);
                                        region_truncate(region, region_svp);
                                        return NULL;
                                }
@@ -134,8 +140,7 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
                 * If it turns out that resulting tuple modifies primary
                 * key, then simply ignore this upsert.
                 */
-               if (stmt != NULL &&
-                   vy_apply_result_does_cross_pk(stmt, exec_res,
+               if (vy_apply_result_does_cross_pk(stmt, exec_res,
                                                  exec_res + mp_size, cmp_def,
                                                  column_mask)) {
                        if (!suppress_error) {
@@ -164,6 +169,8 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
        struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
                                                              result_mp_end);
        region_truncate(region, region_svp);
+       if (stmt_is_void)
+               tuple_unref(stmt);
        if (new_terminal_stmt == NULL)
                return NULL;
        vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));
diff --git a/test/vinyl/upsert.result b/test/vinyl/upsert.result
index a66e87be0..fe673ad6f 100644
--- a/test/vinyl/upsert.result
+++ b/test/vinyl/upsert.result
@@ -1372,3 +1372,97 @@ s:select{}
 s:drop()
 ---
 ...
+-- Combination of upserts and underlying void (i.e. delete or null)
+-- statement on disk. Upsert modifying PK is skipped.
+--
+s = box.schema.space.create('test', {engine = 'vinyl'})
+---
+...
+i = s:create_index('test', { run_count_per_level = 20 })
+---
+...
+for i = 101, 110 do s:replace{i, i} end
+---
+...
+s:replace({1, 1})
+---
+- [1, 1]
+...
+box.snapshot()
+---
+- ok
+...
+s:delete({1})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 1}, {{'=', 2, 2}})
+---
+...
+s:upsert({1, 1}, {{'=', 1, 0}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 1]
+  - [101, 101]
+  - [102, 102]
+  - [103, 103]
+  - [104, 104]
+  - [105, 105]
+  - [106, 106]
+  - [107, 107]
+  - [108, 108]
+  - [109, 109]
+  - [110, 110]
+...
+s:drop()
+---
+...
+s = box.schema.space.create('test', {engine = 'vinyl'})
+---
+...
+i = s:create_index('test', { run_count_per_level = 20 })
+---
+...
+for i = 101, 110 do s:replace{i, i} end
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:upsert({1, 1}, {{'=', 2, 2}})
+---
+...
+s:upsert({1, 1}, {{'=', 1, 0}})
+---
+...
+box.snapshot()
+---
+- ok
+...
+s:select()
+---
+- - [1, 1]
+  - [101, 101]
+  - [102, 102]
+  - [103, 103]
+  - [104, 104]
+  - [105, 105]
+  - [106, 106]
+  - [107, 107]
+  - [108, 108]
+  - [109, 109]
+  - [110, 110]
+...
+s:drop()
+---
+...
diff --git a/test/vinyl/upsert.test.lua b/test/vinyl/upsert.test.lua
index ad69f4b72..b62c19978 100644
--- a/test/vinyl/upsert.test.lua
+++ b/test/vinyl/upsert.test.lua
@@ -566,3 +566,33 @@ box.snapshot()
 s:select{}
 
 s:drop()
+
+-- Combination of upserts and underlying void (i.e. delete or null)
+-- statement on disk. Upsert modifying PK is skipped.
+--
+s = box.schema.space.create('test', {engine = 'vinyl'})
+i = s:create_index('test', { run_count_per_level = 20 })
+
+for i = 101, 110 do s:replace{i, i} end
+s:replace({1, 1})
+box.snapshot()
+s:delete({1})
+box.snapshot()
+s:upsert({1, 1}, {{'=', 2, 2}})
+s:upsert({1, 1}, {{'=', 1, 0}})
+box.snapshot()
+s:select()
+
+s:drop()
+
+s = box.schema.space.create('test', {engine = 'vinyl'})
+i = s:create_index('test', { run_count_per_level = 20 })
+
+for i = 101, 110 do s:replace{i, i} end
+box.snapshot()
+s:upsert({1, 1}, {{'=', 2, 2}})
+s:upsert({1, 1}, {{'=', 1, 0}})
+box.snapshot()
+s:select()
+
+s:drop()


> > +			if (!suppress_error) {
> > +				say_error("upsert operations %s are not applied"\
> > +					  " due to primary key modification",
> > +					  mp_str(ups_ops));
> > +			}
> > +			ups_ops = ups_ops_end;
> > +			continue;
> > +		}
> > +		ups_ops = ups_ops_end;
> > +		/*
> > +		 * Result statement must satisfy space's format. Since upsert's
> > +		 * tuple correctness is already checked in vy_upsert(), let's
> > +		 * use its format to provide result verification.
> > +		 */
> > +		struct tuple_format *format = tuple_format(upsert);
> > +		if (tuple_validate_raw(format, exec_res) != 0) {
> > +			if (! suppress_error)
> > +				diag_log();
> > +			continue;
> > +		}
> > +		result_mp = exec_res;
> > +		result_mp_end = exec_res + mp_size;
> > +	}
> > +	struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
> > +							      result_mp_end);
> > +	region_truncate(region, region_svp);
> > +	if (new_terminal_stmt == NULL)
> > +		return NULL;
> > +	vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));
> > +	return new_terminal_stmt;
> > +}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-14  0:15     ` Nikita Pettik
@ 2020-10-14  8:58       ` Aleksandr Lyapunov
  2020-10-14 10:42         ` Nikita Pettik
  0 siblings, 1 reply; 19+ messages in thread
From: Aleksandr Lyapunov @ 2020-10-14  8:58 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy


>          if (stmt_is_void) {
>                  ups_cnt--;
>                  mp_next(&ups_ops);
> +               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
> +                                          result_mp_end);
Why can't you just use `upsert` tuple itself instead of created tuple by 
its format and data?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-14  8:58       ` Aleksandr Lyapunov
@ 2020-10-14 10:42         ` Nikita Pettik
  2020-10-14 11:47           ` Aleksandr Lyapunov
  0 siblings, 1 reply; 19+ messages in thread
From: Nikita Pettik @ 2020-10-14 10:42 UTC (permalink / raw)
  To: Aleksandr Lyapunov; +Cc: tarantool-patches, v.shpilevoy

On 14 Oct 11:58, Aleksandr Lyapunov wrote:
> 
> >          if (stmt_is_void) {
> >                  ups_cnt--;
> >                  mp_next(&ups_ops);
> > +               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
> > +                                          result_mp_end);
> Why can't you just use `upsert` tuple itself instead of created tuple by its
> format and data?
>

For sure we can. Sorry it was late night so I didn't pay attention to this fact.
Diff:

diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index 0b4baf6ff..fdae931f6 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -102,15 +102,13 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
        const char *ups_ops = new_ops;
        /*
         * In case upsert folds into insert, we must skip first
-        * update operations. Moreover, we should create (in case of delete
-        * statement - re-create since delete contains only key) tuple with
-        * format to use it for PK modification check.
+        * update operations. Moreover, we should use upsert's tuple
+        * to provide PK modification check.
         */
        if (stmt_is_void) {
                ups_cnt--;
                mp_next(&ups_ops);
-               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
-                                          result_mp_end);
+               stmt = upsert;
        }
        for (uint32_t i = 0; i < ups_cnt; ++i) {
                assert(mp_typeof(*ups_ops) == MP_ARRAY);
@@ -126,8 +124,6 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
                                assert(e != NULL);
                                /* Bail out immediately in case of OOM. */
                                if (e->type != &type_ClientError) {
-                                       if (stmt_is_void)
-                                               tuple_unref(stmt);
                                        region_truncate(region, region_svp);
                                        return NULL;
                                }
@@ -169,8 +165,6 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
        struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
                                                              result_mp_end);
        region_truncate(region, region_svp);
-       if (stmt_is_void)
-               tuple_unref(stmt);
        if (new_terminal_stmt == NULL)
                return NULL;
        vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));

 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-14 10:42         ` Nikita Pettik
@ 2020-10-14 11:47           ` Aleksandr Lyapunov
  2020-10-15  0:36             ` Nikita Pettik
  0 siblings, 1 reply; 19+ messages in thread
From: Aleksandr Lyapunov @ 2020-10-14 11:47 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

Great, LGTM!

On 10/14/20 1:42 PM, Nikita Pettik wrote:
> On 14 Oct 11:58, Aleksandr Lyapunov wrote:
>>>           if (stmt_is_void) {
>>>                   ups_cnt--;
>>>                   mp_next(&ups_ops);
>>> +               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
>>> +                                          result_mp_end);
>> Why can't you just use `upsert` tuple itself instead of created tuple by its
>> format and data?
>>
> For sure we can. Sorry it was late night so I didn't pay attention to this fact.
> Diff:
>
> diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
> index 0b4baf6ff..fdae931f6 100644
> --- a/src/box/vy_upsert.c
> +++ b/src/box/vy_upsert.c
> @@ -102,15 +102,13 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
>          const char *ups_ops = new_ops;
>          /*
>           * In case upsert folds into insert, we must skip first
> -        * update operations. Moreover, we should create (in case of delete
> -        * statement - re-create since delete contains only key) tuple with
> -        * format to use it for PK modification check.
> +        * update operations. Moreover, we should use upsert's tuple
> +        * to provide PK modification check.
>           */
>          if (stmt_is_void) {
>                  ups_cnt--;
>                  mp_next(&ups_ops);
> -               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
> -                                          result_mp_end);
> +               stmt = upsert;
>          }
>          for (uint32_t i = 0; i < ups_cnt; ++i) {
>                  assert(mp_typeof(*ups_ops) == MP_ARRAY);
> @@ -126,8 +124,6 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
>                                  assert(e != NULL);
>                                  /* Bail out immediately in case of OOM. */
>                                  if (e->type != &type_ClientError) {
> -                                       if (stmt_is_void)
> -                                               tuple_unref(stmt);
>                                          region_truncate(region, region_svp);
>                                          return NULL;
>                                  }
> @@ -169,8 +165,6 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
>          struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
>                                                                result_mp_end);
>          region_truncate(region, region_svp);
> -       if (stmt_is_void)
> -               tuple_unref(stmt);
>          if (new_terminal_stmt == NULL)
>                  return NULL;
>          vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));
>
>   

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-13 10:18   ` Nikita Pettik
@ 2020-10-14 23:44     ` Vladislav Shpilevoy
  2020-10-15  1:42       ` Nikita Pettik
  0 siblings, 1 reply; 19+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-14 23:44 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Hi! Thanks for the patch!

The test looks fine, but hard to extend. I would better try to
organize a more extendible folder structure, like test/xlog/upgrade
does. With the current 'flat' structure we will have problems with
adding more tests for other versions in the future.

I see you already pushed it, so I am too late.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/2] vinyl: rework upsert operation
  2020-10-14 11:47           ` Aleksandr Lyapunov
@ 2020-10-15  0:36             ` Nikita Pettik
  0 siblings, 0 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-15  0:36 UTC (permalink / raw)
  To: Aleksandr Lyapunov; +Cc: tarantool-patches, v.shpilevoy

On 14 Oct 14:47, Aleksandr Lyapunov wrote:
> Great, LGTM!

Pushed to master, changelog is updated, branch is dropped.
 
> On 10/14/20 1:42 PM, Nikita Pettik wrote:
> > On 14 Oct 11:58, Aleksandr Lyapunov wrote:
> > > >           if (stmt_is_void) {
> > > >                   ups_cnt--;
> > > >                   mp_next(&ups_ops);
> > > > +               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
> > > > +                                          result_mp_end);
> > > Why can't you just use `upsert` tuple itself instead of created tuple by its
> > > format and data?
> > > 
> > For sure we can. Sorry it was late night so I didn't pay attention to this fact.
> > Diff:
> > 
> > diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
> > index 0b4baf6ff..fdae931f6 100644
> > --- a/src/box/vy_upsert.c
> > +++ b/src/box/vy_upsert.c
> > @@ -102,15 +102,13 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
> >          const char *ups_ops = new_ops;
> >          /*
> >           * In case upsert folds into insert, we must skip first
> > -        * update operations. Moreover, we should create (in case of delete
> > -        * statement - re-create since delete contains only key) tuple with
> > -        * format to use it for PK modification check.
> > +        * update operations. Moreover, we should use upsert's tuple
> > +        * to provide PK modification check.
> >           */
> >          if (stmt_is_void) {
> >                  ups_cnt--;
> >                  mp_next(&ups_ops);
> > -               stmt = vy_stmt_new_replace(tuple_format(upsert), result_mp,
> > -                                          result_mp_end);
> > +               stmt = upsert;
> >          }
> >          for (uint32_t i = 0; i < ups_cnt; ++i) {
> >                  assert(mp_typeof(*ups_ops) == MP_ARRAY);
> > @@ -126,8 +124,6 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
> >                                  assert(e != NULL);
> >                                  /* Bail out immediately in case of OOM. */
> >                                  if (e->type != &type_ClientError) {
> > -                                       if (stmt_is_void)
> > -                                               tuple_unref(stmt);
> >                                          region_truncate(region, region_svp);
> >                                          return NULL;
> >                                  }
> > @@ -169,8 +165,6 @@ vy_apply_upsert_on_terminal_stmt(struct tuple *upsert, struct tuple *stmt,
> >          struct tuple *new_terminal_stmt = vy_stmt_new_replace(format, result_mp,
> >                                                                result_mp_end);
> >          region_truncate(region, region_svp);
> > -       if (stmt_is_void)
> > -               tuple_unref(stmt);
> >          if (new_terminal_stmt == NULL)
> >                  return NULL;
> >          vy_stmt_set_lsn(new_terminal_stmt, vy_stmt_lsn(upsert));
> > 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation
  2020-10-14 23:44     ` Vladislav Shpilevoy
@ 2020-10-15  1:42       ` Nikita Pettik
  0 siblings, 0 replies; 19+ messages in thread
From: Nikita Pettik @ 2020-10-15  1:42 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 15 Oct 01:44, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> The test looks fine, but hard to extend. I would better try to
> organize a more extendible folder structure, like test/xlog/upgrade
> does. With the current 'flat' structure we will have problems with
> adding more tests for other versions in the future.
> 
> I see you already pushed it, so I am too late.

Np, I'll do it tomorrow as follow-up. I've also found that
vinyl/upgrade.test.lua is disabled and seems there's no corresponding
snap/vylog files to run this test at all..

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-10-15  1:42 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-03 13:28 [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 1/2] " Nikita Pettik
2020-10-06 22:12   ` Vladislav Shpilevoy
2020-10-09 15:06     ` Nikita Pettik
2020-10-13 19:00   ` Aleksandr Lyapunov
2020-10-14  0:15     ` Nikita Pettik
2020-10-14  8:58       ` Aleksandr Lyapunov
2020-10-14 10:42         ` Nikita Pettik
2020-10-14 11:47           ` Aleksandr Lyapunov
2020-10-15  0:36             ` Nikita Pettik
2020-10-03 13:28 ` [Tarantool-patches] [PATCH v3 2/2] vinyl: remove squash procedures from source code Nikita Pettik
2020-10-03 13:52 ` [Tarantool-patches] [PATCH v3 0/2] vinyl: rework upsert operation Nikita Pettik
2020-10-06 22:12 ` Vladislav Shpilevoy
2020-10-09 15:10   ` Nikita Pettik
2020-10-11 15:35     ` Vladislav Shpilevoy
2020-10-11 15:35 ` Vladislav Shpilevoy
2020-10-13 10:18   ` Nikita Pettik
2020-10-14 23:44     ` Vladislav Shpilevoy
2020-10-15  1:42       ` Nikita Pettik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox