Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
@ 2020-04-13 21:55 Nikita Pettik
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied Nikita Pettik
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Nikita Pettik @ 2020-04-13 21:55 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

Branch: https://github.com/tarantool/tarantool/tree/np/gh-1622-skip-invalid-upserts
Issue: https://github.com/tarantool/tarantool/issues/1622

In fact, the first patch in series does not make much sense without
second since otherwise dump or compaction processes (and any other
operations that read data as well) will fail whenever they are started.
But still I've divided fix into two patches for the sake of review
simplicity.

Nikita Pettik (2):
  vinyl: validate resulting tuple after upsert is applied
  vinyl: skip invalid upserts during squash

 src/box/vy_history.c                          |  20 ++-
 src/box/vy_lsm.c                              |  13 +-
 src/box/vy_tx.c                               |  29 ++--
 src/box/vy_upsert.c                           |   4 +
 src/box/vy_write_iterator.c                   |  34 +++--
 .../vinyl/gh-1622-skip-invalid-upserts.result | 135 ++++++++++++++++++
 .../gh-1622-skip-invalid-upserts.test.lua     |  50 +++++++
 7 files changed, 258 insertions(+), 27 deletions(-)
 create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.result
 create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.test.lua

-- 
2.17.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied
  2020-04-13 21:55 [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Nikita Pettik
@ 2020-04-13 21:55 ` Nikita Pettik
  2020-06-22 19:28   ` Aleksandr Lyapunov
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash Nikita Pettik
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-04-13 21:55 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

There's no check that the result of upsert squashing will feature
correct format. As a consequence one is able to get tuples in space
which do not respect format. For instance:

box.schema.space.create('vinyl',{engine='vinyl',field_count=1})
box.space.vinyl:insert{1}
box.space.vinyl:upsert({1},{{'=',2,5}})

The last statement does not raise any errors. So upsert is applied and
now there's [1, 5] tuple in space (which violates 'field_count' format
restriction).

To avoid such situations, let's validate result of upsert application
and check format of resulting tuple.

Part of #1622
---
 src/box/vy_upsert.c                           |  4 +++
 .../vinyl/gh-1622-skip-invalid-upserts.result | 26 +++++++++++++++++++
 .../gh-1622-skip-invalid-upserts.test.lua     | 11 ++++++++
 3 files changed, 41 insertions(+)
 create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.result
 create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.test.lua

diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
index ebea2789c..6855b9820 100644
--- a/src/box/vy_upsert.c
+++ b/src/box/vy_upsert.c
@@ -134,6 +134,10 @@ vy_apply_upsert(const struct tuple *new_stmt, const struct tuple *old_stmt,
 					 &mp_size, 0, suppress_error,
 					 &column_mask);
 	result_mp_end = result_mp + mp_size;
+	if (tuple_validate_raw(format, result_mp) != 0) {
+		region_truncate(region, region_svp);
+		return NULL;
+	}
 	if (old_type != IPROTO_UPSERT) {
 		assert(old_type == IPROTO_INSERT ||
 		       old_type == IPROTO_REPLACE);
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result
new file mode 100644
index 000000000..437ff3c51
--- /dev/null
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.result
@@ -0,0 +1,26 @@
+-- test-run result file version 2
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+ | ---
+ | ...
+pk = s:create_index('pk')
+ | ---
+ | ...
+s:replace{1, 1}
+ | ---
+ | - [1, 1]
+ | ...
+-- Error is logged, upsert is not applied.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+ | ---
+ | ...
+-- Invalid upsert still appears during read.
+--
+s:select{}
+ | ---
+ | - error: Tuple field count 3 does not match space field count 2
+ | ...
+
+s:drop()
+ | ---
+ | ...
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
new file mode 100644
index 000000000..952d2bcde
--- /dev/null
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
@@ -0,0 +1,11 @@
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+pk = s:create_index('pk')
+s:replace{1, 1}
+-- Error is logged, upsert is not applied.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+-- Invalid upsert still appears during read.
+--
+s:select{}
+
+s:drop()
\ No newline at end of file
-- 
2.17.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-04-13 21:55 [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Nikita Pettik
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied Nikita Pettik
@ 2020-04-13 21:55 ` Nikita Pettik
  2020-04-13 22:12   ` Konstantin Osipov
  2020-05-01  0:31   ` Vladislav Shpilevoy
  2020-05-20 22:13 ` [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Vladislav Shpilevoy
  2020-06-02 21:36 ` Vladislav Shpilevoy
  3 siblings, 2 replies; 29+ messages in thread
From: Nikita Pettik @ 2020-04-13 21:55 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

Instead of aborting merge sequence of upserts let's log appeared
errors and skip upserts which can't be applied. It makes sense
taking into consideration previous commit: now upsert statements which
can't be applied may appear in mems/runs (previously squash operation
might fail only due to OOM). As a result, if we didn't ignore invalid
upserts, dump or compaction processes would not be able to finish (owing
to inability to squash upserts).

Closes #1622
---
 src/box/vy_history.c                          |  20 +++-
 src/box/vy_lsm.c                              |  13 +-
 src/box/vy_tx.c                               |  29 +++--
 src/box/vy_write_iterator.c                   |  34 ++++--
 .../vinyl/gh-1622-skip-invalid-upserts.result | 113 +++++++++++++++++-
 .../gh-1622-skip-invalid-upserts.test.lua     |  41 ++++++-
 6 files changed, 220 insertions(+), 30 deletions(-)

diff --git a/src/box/vy_history.c b/src/box/vy_history.c
index 0f3b71195..55eb3c669 100644
--- a/src/box/vy_history.c
+++ b/src/box/vy_history.c
@@ -102,12 +102,20 @@ vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
 	while (node != NULL) {
 		struct tuple *stmt = vy_apply_upsert(node->stmt, curr_stmt,
 						     cmp_def, format, true);
-		++*upserts_applied;
-		if (curr_stmt != NULL)
-			tuple_unref(curr_stmt);
-		if (stmt == NULL)
-			return -1;
-		curr_stmt = stmt;
+		if (stmt == NULL) {
+			/*
+			 * In case statement hasn't been applied,
+			 * simply skip it ignoring errors (otherwise
+			 * error will appear during tuple read).
+			 */
+			assert(diag_last_error(diag_get()) != NULL);
+			diag_clear(diag_get());
+		} else {
+			++*upserts_applied;
+			if (curr_stmt != NULL)
+				tuple_unref(curr_stmt);
+			curr_stmt = stmt;
+		}
 		node = rlist_prev_entry_safe(node, &history->stmts, link);
 	}
 	*ret = curr_stmt;
diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
index 04c9926a8..3d630fc01 100644
--- a/src/box/vy_lsm.c
+++ b/src/box/vy_lsm.c
@@ -984,13 +984,20 @@ vy_lsm_commit_upsert(struct vy_lsm *lsm, struct vy_mem *mem,
 		struct tuple *upserted =
 			vy_apply_upsert(stmt, older, lsm->cmp_def,
 					lsm->mem_format, false);
-		lsm->stat.upsert.applied++;
-
 		if (upserted == NULL) {
-			/* OOM */
+			/*
+			 * OOM or upserted tuple does not fulfill
+			 * space format. Hence, upsert can't be
+			 * transformed into replace. Log error
+			 * and exit.
+			 */
+			struct error *e = diag_last_error(diag_get());
+			assert(e != NULL);
+			error_log(e);
 			diag_clear(diag_get());
 			return;
 		}
+		lsm->stat.upsert.applied++;
 		int64_t upserted_lsn = vy_stmt_lsn(upserted);
 		if (upserted_lsn != lsn) {
 			/**
diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
index 5029bd8a1..060a7f6a9 100644
--- a/src/box/vy_tx.c
+++ b/src/box/vy_tx.c
@@ -515,11 +515,15 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
 						    region_stmt);
 				tuple_unref(applied);
 				return rc;
+			} else {
+				/*
+				 * Ignore a memory error, because it is
+				 * not critical to apply the optimization.
+				 * Clear diag: otherwise error is set but
+				 * function may return success return code.
+				 */
+				diag_clear(diag_get());
 			}
-			/*
-			 * Ignore a memory error, because it is
-			 * not critical to apply the optimization.
-			 */
 		}
 	} else {
 		/* Invalidate cache element. */
@@ -1047,12 +1051,17 @@ vy_tx_set(struct vy_tx *tx, struct vy_lsm *lsm, struct tuple *stmt)
 
 		applied = vy_apply_upsert(stmt, old->stmt, lsm->cmp_def,
 					  lsm->mem_format, true);
-		lsm->stat.upsert.applied++;
-		if (applied == NULL)
-			return -1;
-		stmt = applied;
-		assert(vy_stmt_type(stmt) != 0);
-		lsm->stat.upsert.squashed++;
+		if (applied == NULL) {
+			struct error *e = diag_last_error(diag_get());
+			assert(e != NULL);
+			error_log(e);
+			diag_clear(diag_get());
+		} else {
+			lsm->stat.upsert.applied++;
+			stmt = applied;
+			assert(vy_stmt_type(stmt) != 0);
+			lsm->stat.upsert.squashed++;
+		}
 	}
 
 	/* Allocate a MVCC container. */
diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
index 7a6a20627..4b5b3e673 100644
--- a/src/box/vy_write_iterator.c
+++ b/src/box/vy_write_iterator.c
@@ -850,10 +850,22 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
 		       vy_stmt_type(hint) != IPROTO_UPSERT);
 		struct tuple *applied = vy_apply_upsert(h->tuple, hint,
 				stream->cmp_def, stream->format, false);
-		if (applied == NULL)
-			return -1;
-		vy_stmt_unref_if_possible(h->tuple);
-		h->tuple = applied;
+		if (applied == NULL) {
+			/*
+			 * Current upsert can't be applied.
+			 * Let's skip it and log error.
+			 */
+			struct error *e = diag_last_error(diag_get());
+			assert(e != NULL);
+			say_info("Upsert %s is not applied due to the error: %s",
+				 vy_stmt_str(h->tuple), e->errmsg);
+			diag_clear(diag_get());
+			vy_stmt_unref_if_possible(h->tuple);
+			h->tuple = hint;
+		} else {
+			vy_stmt_unref_if_possible(h->tuple);
+			h->tuple = applied;
+		}
 	}
 	/* Squash the rest of UPSERTs. */
 	struct vy_write_history *result = h;
@@ -864,10 +876,16 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
 		assert(result->tuple != NULL);
 		struct tuple *applied = vy_apply_upsert(h->tuple, result->tuple,
 					stream->cmp_def, stream->format, false);
-		if (applied == NULL)
-			return -1;
-		vy_stmt_unref_if_possible(result->tuple);
-		result->tuple = applied;
+		if (applied == NULL) {
+			struct error *e = diag_last_error(diag_get());
+			assert(e != NULL);
+			say_info("Upsert %s is not applied due to the error: %s",
+				 vy_stmt_str(h->tuple), e->errmsg);
+			diag_clear(diag_get());
+		} else {
+			vy_stmt_unref_if_possible(result->tuple);
+			result->tuple = applied;
+		}
 		vy_stmt_unref_if_possible(h->tuple);
 		/*
 		 * Don't bother freeing 'h' since it's
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result
index 437ff3c51..b59ed53dc 100644
--- a/test/vinyl/gh-1622-skip-invalid-upserts.result
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.result
@@ -14,11 +14,120 @@ s:replace{1, 1}
 s:upsert({1, 1}, {{'=', 3, 5}})
  | ---
  | ...
--- Invalid upsert still appears during read.
+-- During read the incorrect upsert is ignored.
 --
 s:select{}
  | ---
- | - error: Tuple field count 3 does not match space field count 2
+ | - - [1, 1]
+ | ...
+
+-- Try to set incorrect field_count in a transaction.
+--
+box.begin()
+ | ---
+ | ...
+s:replace{2, 2}
+ | ---
+ | - [2, 2]
+ | ...
+s:upsert({2, 2}, {{'=', 3, 2}})
+ | ---
+ | ...
+s:select{}
+ | ---
+ | - - [1, 1]
+ |   - [2, 2]
+ | ...
+box.commit()
+ | ---
+ | ...
+s:select{}
+ | ---
+ | - - [1, 1]
+ |   - [2, 2]
+ | ...
+
+-- Read incorrect upsert from a run: it should be ignored.
+--
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+s:select{}
+ | ---
+ | - - [1, 1]
+ |   - [2, 2]
+ | ...
+s:upsert({2, 2}, {{'=', 3, 20}})
+ | ---
+ | ...
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+s:select{}
+ | ---
+ | - - [1, 1]
+ |   - [2, 2]
+ | ...
+
+-- Execute replace/delete after invalid upsert.
+--
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+s:upsert({2, 2}, {{'=', 3, 30}})
+ | ---
+ | ...
+s:replace{2, 3}
+ | ---
+ | - [2, 3]
+ | ...
+s:select{}
+ | ---
+ | - - [1, 1]
+ |   - [2, 3]
+ | ...
+
+s:upsert({1, 1}, {{'=', 3, 30}})
+ | ---
+ | ...
+s:delete{1}
+ | ---
+ | ...
+s:select{}
+ | ---
+ | - - [2, 3]
+ | ...
+
+-- Invalid upsert in a sequence of upserts is skipped meanwhile
+-- the rest are applied.
+--
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+s:upsert({2, 2}, {{'+', 2, 5}})
+ | ---
+ | ...
+s:upsert({2, 2}, {{'=', 3, 40}})
+ | ---
+ | ...
+s:upsert({2, 2}, {{'+', 2, 5}})
+ | ---
+ | ...
+s:select{}
+ | ---
+ | - - [2, 13]
+ | ...
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+s:select{}
+ | ---
+ | - - [2, 13]
  | ...
 
 s:drop()
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
index 952d2bcde..eb93393be 100644
--- a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
@@ -4,8 +4,47 @@ s:replace{1, 1}
 -- Error is logged, upsert is not applied.
 --
 s:upsert({1, 1}, {{'=', 3, 5}})
--- Invalid upsert still appears during read.
+-- During read the incorrect upsert is ignored.
 --
 s:select{}
 
+-- Try to set incorrect field_count in a transaction.
+--
+box.begin()
+s:replace{2, 2}
+s:upsert({2, 2}, {{'=', 3, 2}})
+s:select{}
+box.commit()
+s:select{}
+
+-- Read incorrect upsert from a run: it should be ignored.
+--
+box.snapshot()
+s:select{}
+s:upsert({2, 2}, {{'=', 3, 20}})
+box.snapshot()
+s:select{}
+
+-- Execute replace/delete after invalid upsert.
+--
+box.snapshot()
+s:upsert({2, 2}, {{'=', 3, 30}})
+s:replace{2, 3}
+s:select{}
+
+s:upsert({1, 1}, {{'=', 3, 30}})
+s:delete{1}
+s:select{}
+
+-- Invalid upsert in a sequence of upserts is skipped meanwhile
+-- the rest are applied.
+--
+box.snapshot()
+s:upsert({2, 2}, {{'+', 2, 5}})
+s:upsert({2, 2}, {{'=', 3, 40}})
+s:upsert({2, 2}, {{'+', 2, 5}})
+s:select{}
+box.snapshot()
+s:select{}
+
 s:drop()
\ No newline at end of file
-- 
2.17.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash Nikita Pettik
@ 2020-04-13 22:12   ` Konstantin Osipov
  2020-05-14  2:11     ` Nikita Pettik
  2020-05-01  0:31   ` Vladislav Shpilevoy
  1 sibling, 1 reply; 29+ messages in thread
From: Konstantin Osipov @ 2020-04-13 22:12 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

* Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> Instead of aborting merge sequence of upserts let's log appeared
> errors and skip upserts which can't be applied. It makes sense
> taking into consideration previous commit: now upsert statements which
> can't be applied may appear in mems/runs (previously squash operation
> might fail only due to OOM). As a result, if we didn't ignore invalid
> upserts, dump or compaction processes would not be able to finish (owing
> to inability to squash upserts).

We should log the upsert itself, base64 encoded. 

Second, vy_history_apply may be called from many contexts - reads
and writes, for example. We should only log and skip during
compaction, not when reading, otherwise we'll spam the log file at
least.

Finally, let's double check there are no issues with the used
format - can it become obsolete by the time it's used, e.g. if
there is an online/non-blocking schema change that happened in tx
thread (compaction is running in the write thread)?

> 
> Closes #1622
> ---
>  src/box/vy_history.c                          |  20 +++-
>  src/box/vy_lsm.c                              |  13 +-
>  src/box/vy_tx.c                               |  29 +++--
>  src/box/vy_write_iterator.c                   |  34 ++++--
>  .../vinyl/gh-1622-skip-invalid-upserts.result | 113 +++++++++++++++++-
>  .../gh-1622-skip-invalid-upserts.test.lua     |  41 ++++++-
>  6 files changed, 220 insertions(+), 30 deletions(-)
> 
> diff --git a/src/box/vy_history.c b/src/box/vy_history.c
> index 0f3b71195..55eb3c669 100644
> --- a/src/box/vy_history.c
> +++ b/src/box/vy_history.c
> @@ -102,12 +102,20 @@ vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
>  	while (node != NULL) {
>  		struct tuple *stmt = vy_apply_upsert(node->stmt, curr_stmt,
>  						     cmp_def, format, true);
> -		++*upserts_applied;
> -		if (curr_stmt != NULL)
> -			tuple_unref(curr_stmt);
> -		if (stmt == NULL)
> -			return -1;
> -		curr_stmt = stmt;
> +		if (stmt == NULL) {
> +			/*
> +			 * In case statement hasn't been applied,
> +			 * simply skip it ignoring errors (otherwise
> +			 * error will appear during tuple read).
> +			 */
> +			assert(diag_last_error(diag_get()) != NULL);
> +			diag_clear(diag_get());
> +		} else {
> +			++*upserts_applied;
> +			if (curr_stmt != NULL)
> +				tuple_unref(curr_stmt);
> +			curr_stmt = stmt;
> +		}
>  		node = rlist_prev_entry_safe(node, &history->stmts, link);
>  	}
>  	*ret = curr_stmt;
> diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
> index 04c9926a8..3d630fc01 100644
> --- a/src/box/vy_lsm.c
> +++ b/src/box/vy_lsm.c
> @@ -984,13 +984,20 @@ vy_lsm_commit_upsert(struct vy_lsm *lsm, struct vy_mem *mem,
>  		struct tuple *upserted =
>  			vy_apply_upsert(stmt, older, lsm->cmp_def,
>  					lsm->mem_format, false);
> -		lsm->stat.upsert.applied++;
> -
>  		if (upserted == NULL) {
> -			/* OOM */
> +			/*
> +			 * OOM or upserted tuple does not fulfill
> +			 * space format. Hence, upsert can't be
> +			 * transformed into replace. Log error
> +			 * and exit.
> +			 */
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			error_log(e);
>  			diag_clear(diag_get());
>  			return;
>  		}
> +		lsm->stat.upsert.applied++;
>  		int64_t upserted_lsn = vy_stmt_lsn(upserted);
>  		if (upserted_lsn != lsn) {
>  			/**
> diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
> index 5029bd8a1..060a7f6a9 100644
> --- a/src/box/vy_tx.c
> +++ b/src/box/vy_tx.c
> @@ -515,11 +515,15 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
>  						    region_stmt);
>  				tuple_unref(applied);
>  				return rc;
> +			} else {
> +				/*
> +				 * Ignore a memory error, because it is
> +				 * not critical to apply the optimization.
> +				 * Clear diag: otherwise error is set but
> +				 * function may return success return code.
> +				 */
> +				diag_clear(diag_get());
>  			}
> -			/*
> -			 * Ignore a memory error, because it is
> -			 * not critical to apply the optimization.
> -			 */
>  		}
>  	} else {
>  		/* Invalidate cache element. */
> @@ -1047,12 +1051,17 @@ vy_tx_set(struct vy_tx *tx, struct vy_lsm *lsm, struct tuple *stmt)
>  
>  		applied = vy_apply_upsert(stmt, old->stmt, lsm->cmp_def,
>  					  lsm->mem_format, true);
> -		lsm->stat.upsert.applied++;
> -		if (applied == NULL)
> -			return -1;
> -		stmt = applied;
> -		assert(vy_stmt_type(stmt) != 0);
> -		lsm->stat.upsert.squashed++;
> +		if (applied == NULL) {
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			error_log(e);
> +			diag_clear(diag_get());
> +		} else {
> +			lsm->stat.upsert.applied++;
> +			stmt = applied;
> +			assert(vy_stmt_type(stmt) != 0);
> +			lsm->stat.upsert.squashed++;
> +		}
>  	}
>  
>  	/* Allocate a MVCC container. */
> diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
> index 7a6a20627..4b5b3e673 100644
> --- a/src/box/vy_write_iterator.c
> +++ b/src/box/vy_write_iterator.c
> @@ -850,10 +850,22 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
>  		       vy_stmt_type(hint) != IPROTO_UPSERT);
>  		struct tuple *applied = vy_apply_upsert(h->tuple, hint,
>  				stream->cmp_def, stream->format, false);
> -		if (applied == NULL)
> -			return -1;
> -		vy_stmt_unref_if_possible(h->tuple);
> -		h->tuple = applied;
> +		if (applied == NULL) {
> +			/*
> +			 * Current upsert can't be applied.
> +			 * Let's skip it and log error.
> +			 */
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			say_info("Upsert %s is not applied due to the error: %s",
> +				 vy_stmt_str(h->tuple), e->errmsg);
> +			diag_clear(diag_get());
> +			vy_stmt_unref_if_possible(h->tuple);
> +			h->tuple = hint;
> +		} else {
> +			vy_stmt_unref_if_possible(h->tuple);
> +			h->tuple = applied;
> +		}
>  	}
>  	/* Squash the rest of UPSERTs. */
>  	struct vy_write_history *result = h;
> @@ -864,10 +876,16 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
>  		assert(result->tuple != NULL);
>  		struct tuple *applied = vy_apply_upsert(h->tuple, result->tuple,
>  					stream->cmp_def, stream->format, false);
> -		if (applied == NULL)
> -			return -1;
> -		vy_stmt_unref_if_possible(result->tuple);
> -		result->tuple = applied;
> +		if (applied == NULL) {
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			say_info("Upsert %s is not applied due to the error: %s",
> +				 vy_stmt_str(h->tuple), e->errmsg);
> +			diag_clear(diag_get());
> +		} else {
> +			vy_stmt_unref_if_possible(result->tuple);
> +			result->tuple = applied;
> +		}
>  		vy_stmt_unref_if_possible(h->tuple);
>  		/*
>  		 * Don't bother freeing 'h' since it's
> diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result
> index 437ff3c51..b59ed53dc 100644
> --- a/test/vinyl/gh-1622-skip-invalid-upserts.result
> +++ b/test/vinyl/gh-1622-skip-invalid-upserts.result
> @@ -14,11 +14,120 @@ s:replace{1, 1}
>  s:upsert({1, 1}, {{'=', 3, 5}})
>   | ---
>   | ...
> --- Invalid upsert still appears during read.
> +-- During read the incorrect upsert is ignored.
>  --
>  s:select{}
>   | ---
> - | - error: Tuple field count 3 does not match space field count 2
> + | - - [1, 1]
> + | ...
> +
> +-- Try to set incorrect field_count in a transaction.
> +--
> +box.begin()
> + | ---
> + | ...
> +s:replace{2, 2}
> + | ---
> + | - [2, 2]
> + | ...
> +s:upsert({2, 2}, {{'=', 3, 2}})
> + | ---
> + | ...
> +s:select{}
> + | ---
> + | - - [1, 1]
> + |   - [2, 2]
> + | ...
> +box.commit()
> + | ---
> + | ...
> +s:select{}
> + | ---
> + | - - [1, 1]
> + |   - [2, 2]
> + | ...
> +
> +-- Read incorrect upsert from a run: it should be ignored.
> +--
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +s:select{}
> + | ---
> + | - - [1, 1]
> + |   - [2, 2]
> + | ...
> +s:upsert({2, 2}, {{'=', 3, 20}})
> + | ---
> + | ...
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +s:select{}
> + | ---
> + | - - [1, 1]
> + |   - [2, 2]
> + | ...
> +
> +-- Execute replace/delete after invalid upsert.
> +--
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +s:upsert({2, 2}, {{'=', 3, 30}})
> + | ---
> + | ...
> +s:replace{2, 3}
> + | ---
> + | - [2, 3]
> + | ...
> +s:select{}
> + | ---
> + | - - [1, 1]
> + |   - [2, 3]
> + | ...
> +
> +s:upsert({1, 1}, {{'=', 3, 30}})
> + | ---
> + | ...
> +s:delete{1}
> + | ---
> + | ...
> +s:select{}
> + | ---
> + | - - [2, 3]
> + | ...
> +
> +-- Invalid upsert in a sequence of upserts is skipped meanwhile
> +-- the rest are applied.
> +--
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +s:upsert({2, 2}, {{'+', 2, 5}})
> + | ---
> + | ...
> +s:upsert({2, 2}, {{'=', 3, 40}})
> + | ---
> + | ...
> +s:upsert({2, 2}, {{'+', 2, 5}})
> + | ---
> + | ...
> +s:select{}
> + | ---
> + | - - [2, 13]
> + | ...
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +s:select{}
> + | ---
> + | - - [2, 13]
>   | ...
>  
>  s:drop()
> diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
> index 952d2bcde..eb93393be 100644
> --- a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
> +++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
> @@ -4,8 +4,47 @@ s:replace{1, 1}
>  -- Error is logged, upsert is not applied.
>  --
>  s:upsert({1, 1}, {{'=', 3, 5}})
> --- Invalid upsert still appears during read.
> +-- During read the incorrect upsert is ignored.
>  --
>  s:select{}
>  
> +-- Try to set incorrect field_count in a transaction.
> +--
> +box.begin()
> +s:replace{2, 2}
> +s:upsert({2, 2}, {{'=', 3, 2}})
> +s:select{}
> +box.commit()
> +s:select{}
> +
> +-- Read incorrect upsert from a run: it should be ignored.
> +--
> +box.snapshot()
> +s:select{}
> +s:upsert({2, 2}, {{'=', 3, 20}})
> +box.snapshot()
> +s:select{}
> +
> +-- Execute replace/delete after invalid upsert.
> +--
> +box.snapshot()
> +s:upsert({2, 2}, {{'=', 3, 30}})
> +s:replace{2, 3}
> +s:select{}
> +
> +s:upsert({1, 1}, {{'=', 3, 30}})
> +s:delete{1}
> +s:select{}
> +
> +-- Invalid upsert in a sequence of upserts is skipped meanwhile
> +-- the rest are applied.
> +--
> +box.snapshot()
> +s:upsert({2, 2}, {{'+', 2, 5}})
> +s:upsert({2, 2}, {{'=', 3, 40}})
> +s:upsert({2, 2}, {{'+', 2, 5}})
> +s:select{}
> +box.snapshot()
> +s:select{}
> +
>  s:drop()
> \ No newline at end of file
> -- 
> 2.17.1
> 

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash Nikita Pettik
  2020-04-13 22:12   ` Konstantin Osipov
@ 2020-05-01  0:31   ` Vladislav Shpilevoy
  2020-05-14  2:21     ` Nikita Pettik
  2020-05-26 21:33     ` Vladislav Shpilevoy
  1 sibling, 2 replies; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-01  0:31 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Hi! Thanks for the patch!

Firstly, Kostja left some comments here. Would be cool to address them.

Secondly, here is my personal opinion. I don't like just skipping things
a user committed without any error appearing in the application. IMO, we
should apply only the first commit. And let a user see this error so as he
could notice the problem. To fix reads he could do delete() of the bad key.

However, how a user will be able to find the exact broken key - I don't
know. Maybe the ignore + logging is better.

On 13/04/2020 23:55, Nikita Pettik wrote:
> Instead of aborting merge sequence of upserts let's log appeared
> errors and skip upserts which can't be applied. It makes sense
> taking into consideration previous commit: now upsert statements which
> can't be applied may appear in mems/runs (previously squash operation
> might fail only due to OOM). As a result, if we didn't ignore invalid
> upserts, dump or compaction processes would not be able to finish (owing
> to inability to squash upserts).
> 
> Closes #1622
> ---
> diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
> index 5029bd8a1..060a7f6a9 100644
> --- a/src/box/vy_tx.c
> +++ b/src/box/vy_tx.c
> @@ -515,11 +515,15 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
>  						    region_stmt);
>  				tuple_unref(applied);
>  				return rc;
> +			} else {
> +				/*
> +				 * Ignore a memory error, because it is
> +				 * not critical to apply the optimization.
> +				 * Clear diag: otherwise error is set but
> +				 * function may return success return code.
> +				 */
> +				diag_clear(diag_get());

Why do you clear it? Diagnostics area is usually not cleared (at least
in application code), and contains some last happened error. In C code we
anyway use result value of a function to determine its result.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-04-13 22:12   ` Konstantin Osipov
@ 2020-05-14  2:11     ` Nikita Pettik
  2020-05-14  6:56       ` Konstantin Osipov
  0 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-05-14  2:11 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches, v.shpilevoy

On 14 Apr 01:12, Konstantin Osipov wrote:
> * Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> > Instead of aborting merge sequence of upserts let's log appeared
> > errors and skip upserts which can't be applied. It makes sense
> > taking into consideration previous commit: now upsert statements which
> > can't be applied may appear in mems/runs (previously squash operation
> > might fail only due to OOM). As a result, if we didn't ignore invalid
> > upserts, dump or compaction processes would not be able to finish (owing
> > to inability to squash upserts).
> 
> We should log the upsert itself, base64 encoded. 

It is logged with following format (in vy_read_view_merge()):

say_info("upsert %s is not applied due to the error: %s",
         vy_stmt_str(h->tuple), e->errmsg);
         
It is the function which is called during compaction to squash
upserts.

Also, I've added logs during squash/apply of upserts in in-memory
tree: see diffs in vy_tx_write() and vy_tx_set(). So that user gets
the same error as when executing invalid upserts in memtx. However,
I've found these logs a bit inconsistent: upserts are not always
squashed/applied once they are inserted in tree. For instance:

s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
pk = s:create_index('pk')
s:replace{1, 1}
s:upsert({1, 1}, {{'=', 3, 5}})
main/101/interactive tuple.c:150 E> ER_EXACT_FIELD_COUNT: Tuple field count 3 does not match space field count 2 

This upsert is attepted to be optimized: in vy_lsm_commit_upsert()
we try to apply it on replace statement, but we fail so error is logged
(still upsert has been inserted in tree). Now, if we execute this
upsert once again:

s:upsert({1, 1}, {{'=', 3, 5}})

error won't be logged, since previous upsert now resides in tree
and we won't attempt at applying it (according to optimization's conditions).

Hence now I'm not sure that logging failed upsert squashes/applications
is worth it when it comes for in-memory lsm level.

> Second, vy_history_apply may be called from many contexts - reads
> and writes, for example. We should only log and skip during
> compaction, not when reading, otherwise we'll spam the log file at
> least.

We don't log anything in vy_history_apply(). So there are no logs
during reads.

> Finally, let's double check there are no issues with the used
> format - can it become obsolete by the time it's used, e.g. if
> there is an online/non-blocking schema change that happened in tx
> thread (compaction is running in the write thread)?

Upserts are supported only by primary index. Meanwhile vinyl does
not support altering primary index of non-empty space. Am I missing
something?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-01  0:31   ` Vladislav Shpilevoy
@ 2020-05-14  2:21     ` Nikita Pettik
  2020-05-14 21:32       ` Vladislav Shpilevoy
  2020-05-26 21:33     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-05-14  2:21 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 01 May 02:31, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> Firstly, Kostja left some comments here. Would be cool to address them.

Done (sorry, I did not ignore them, just had to work on other more vital bugs).
 
> Secondly, here is my personal opinion. I don't like just skipping things
> a user committed without any error appearing in the application. IMO, we
> should apply only the first commit. And let a user see this error so as he
> could notice the problem. To fix reads he could do delete() of the bad key.

The problem with delete it leaves user no way to restore the rest
of upsert history. Moreover, these upserts will get stuck until
user finds in logs corresponding error (I guess we can't abort
compaction due to invalid upserts).

> However, how a user will be able to find the exact broken key - I don't
> know. Maybe the ignore + logging is better.

Why can't we just log broken key? E.g. see logs in vy_apply_upsert().

> On 13/04/2020 23:55, Nikita Pettik wrote:
> > Instead of aborting merge sequence of upserts let's log appeared
> > errors and skip upserts which can't be applied. It makes sense
> > taking into consideration previous commit: now upsert statements which
> > can't be applied may appear in mems/runs (previously squash operation
> > might fail only due to OOM). As a result, if we didn't ignore invalid
> > upserts, dump or compaction processes would not be able to finish (owing
> > to inability to squash upserts).
> > 
> > Closes #1622
> > ---
> > diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
> > index 5029bd8a1..060a7f6a9 100644
> > --- a/src/box/vy_tx.c
> > +++ b/src/box/vy_tx.c
> > @@ -515,11 +515,15 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
> >  						    region_stmt);
> >  				tuple_unref(applied);
> >  				return rc;
> > +			} else {
> > +				/*
> > +				 * Ignore a memory error, because it is
> > +				 * not critical to apply the optimization.
> > +				 * Clear diag: otherwise error is set but
> > +				 * function may return success return code.
> > +				 */
> > +				diag_clear(diag_get());
> 
> Why do you clear it? Diagnostics area is usually not cleared (at least
> in application code), and contains some last happened error. In C code we
> anyway use result value of a function to determine its result.

Agree, forgot that we do not erase diag before each request execution.
Removed this clean-up.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-14  2:11     ` Nikita Pettik
@ 2020-05-14  6:56       ` Konstantin Osipov
  2020-05-19 19:10         ` Nikita Pettik
  0 siblings, 1 reply; 29+ messages in thread
From: Konstantin Osipov @ 2020-05-14  6:56 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

* Nikita Pettik <korablev@tarantool.org> [20/05/14 09:50]:
> On 14 Apr 01:12, Konstantin Osipov wrote:
> > * Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> > > Instead of aborting merge sequence of upserts let's log appeared
> > > errors and skip upserts which can't be applied. It makes sense
> > > taking into consideration previous commit: now upsert statements which
> > > can't be applied may appear in mems/runs (previously squash operation
> > > might fail only due to OOM). As a result, if we didn't ignore invalid
> > > upserts, dump or compaction processes would not be able to finish (owing
> > > to inability to squash upserts).
> > 
> > We should log the upsert itself, base64 encoded. 
> 
> It is logged with following format (in vy_read_view_merge()):
> 
> say_info("upsert %s is not applied due to the error: %s",
>          vy_stmt_str(h->tuple), e->errmsg);
>          
> It is the function which is called during compaction to squash
> upserts.
> 
> Also, I've added logs during squash/apply of upserts in in-memory
> tree: see diffs in vy_tx_write() and vy_tx_set(). So that user gets
> the same error as when executing invalid upserts in memtx. However,
> I've found these logs a bit inconsistent: upserts are not always
> squashed/applied once they are inserted in tree. For instance:
> 
> s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
> pk = s:create_index('pk')
> s:replace{1, 1}
> s:upsert({1, 1}, {{'=', 3, 5}})
> main/101/interactive tuple.c:150 E> ER_EXACT_FIELD_COUNT: Tuple field count 3 does not match space field count 2 
> 
> This upsert is attepted to be optimized: in vy_lsm_commit_upsert()
> we try to apply it on replace statement, but we fail so error is logged
> (still upsert has been inserted in tree). Now, if we execute this
> upsert once again:
> 
> s:upsert({1, 1}, {{'=', 3, 5}})
> 
> error won't be logged, since previous upsert now resides in tree
> and we won't attempt at applying it (according to optimization's conditions).
> 
> Hence now I'm not sure that logging failed upsert squashes/applications
> is worth it when it comes for in-memory lsm level.

This is fine, the correct behaviour is not possible to define anyway, so best
effort on tarantool part is good enough.

If it becomes an issue we could create a local space which will
receive these upserts, instead of a log file, something like
_vy_failed_upserts, but I'd not expect it to be very useful.

> > Second, vy_history_apply may be called from many contexts - reads
> > and writes, for example. We should only log and skip during
> > compaction, not when reading, otherwise we'll spam the log file at
> > least.
> 
> We don't log anything in vy_history_apply(). So there are no logs
> during reads.

OK

> > Finally, let's double check there are no issues with the used
> > format - can it become obsolete by the time it's used, e.g. if
> > there is an online/non-blocking schema change that happened in tx
> > thread (compaction is running in the write thread)?
> 
> Upserts are supported only by primary index. Meanwhile vinyl does
> not support altering primary index of non-empty space. Am I missing
> something?

I mean the space format object itself. Squashing is happening in a
different thread, the 
> 

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-14  2:21     ` Nikita Pettik
@ 2020-05-14 21:32       ` Vladislav Shpilevoy
  2020-05-19 18:18         ` Nikita Pettik
  0 siblings, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-14 21:32 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Thanks for the fixes!

On 14/05/2020 04:21, Nikita Pettik wrote:
> On 01 May 02:31, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the patch!
>>
>> Firstly, Kostja left some comments here. Would be cool to address them.
> 
> Done (sorry, I did not ignore them, just had to work on other more vital bugs).
>  
>> Secondly, here is my personal opinion. I don't like just skipping things
>> a user committed without any error appearing in the application. IMO, we
>> should apply only the first commit. And let a user see this error so as he
>> could notice the problem. To fix reads he could do delete() of the bad key.
> 
> The problem with delete it leaves user no way to restore the rest
> of upsert history. Moreover, these upserts will get stuck until
> user finds in logs corresponding error (I guess we can't abort
> compaction due to invalid upserts).
> 
>> However, how a user will be able to find the exact broken key - I don't
>> know. Maybe the ignore + logging is better.
> 
> Why can't we just log broken key? E.g. see logs in vy_apply_upsert().

We can log it. This is what you do in this patchset.

I also noticed, that you skip the failed upsert in case of any error.
Even if it is an OOM, not related to format problems. I think it is safer
to check error type, and skip it only in case of a ClientError.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-14 21:32       ` Vladislav Shpilevoy
@ 2020-05-19 18:18         ` Nikita Pettik
  2020-05-20 22:13           ` Vladislav Shpilevoy
  0 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-05-19 18:18 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 14 May 23:32, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
> 
> On 14/05/2020 04:21, Nikita Pettik wrote:
> > On 01 May 02:31, Vladislav Shpilevoy wrote:
> >> Hi! Thanks for the patch!
> >>
> >> Firstly, Kostja left some comments here. Would be cool to address them.
> > 
> > Done (sorry, I did not ignore them, just had to work on other more vital bugs).
> >  
> >> Secondly, here is my personal opinion. I don't like just skipping things
> >> a user committed without any error appearing in the application. IMO, we
> >> should apply only the first commit. And let a user see this error so as he
> >> could notice the problem. To fix reads he could do delete() of the bad key.
> > 
> > The problem with delete it leaves user no way to restore the rest
> > of upsert history. Moreover, these upserts will get stuck until
> > user finds in logs corresponding error (I guess we can't abort
> > compaction due to invalid upserts).
> > 
> >> However, how a user will be able to find the exact broken key - I don't
> >> know. Maybe the ignore + logging is better.
> > 
> > Why can't we just log broken key? E.g. see logs in vy_apply_upsert().
> 
> We can log it. This is what you do in this patchset.
> 
> I also noticed, that you skip the failed upsert in case of any error.
> Even if it is an OOM, not related to format problems. I think it is safer
> to check error type, and skip it only in case of a ClientError.

Personally I don't like relying on error type, but this pattern is
already in the source code (e.g. upsert_do_ops), so here's diff:

diff --git a/src/box/vy_history.c b/src/box/vy_history.c
index 8c4719a85..2232348f9 100644
--- a/src/box/vy_history.c
+++ b/src/box/vy_history.c
@@ -109,6 +109,9 @@ vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
                         * error will appear during tuple read).
                         */
                        assert(diag_last_error(diag_get()) != NULL);
+                       struct error *e = diag_last_error(diag_get());
+                       if (e->type != &type_ClientError)
+                               return -1;
                } else {
                        ++*upserts_applied;
                        if (curr_stmt != NULL)
diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
index aefad5942..1a2ea43d9 100644
--- a/src/box/vy_tx.c
+++ b/src/box/vy_tx.c
@@ -1050,6 +1050,8 @@ vy_tx_set(struct vy_tx *tx, struct vy_lsm *lsm, struct tuple *stmt)
                if (applied == NULL) {
                        struct error *e = diag_last_error(diag_get());
                        assert(e != NULL);
+                       if (e->type != &type_ClientError)
+                               return -1;
                        error_log(e);
                } else {
                        lsm->stat.upsert.applied++;
diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
index 13d5bbede..4d34d96c8 100644
--- a/src/box/vy_write_iterator.c
+++ b/src/box/vy_write_iterator.c
@@ -857,6 +857,8 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
                         */
                        struct error *e = diag_last_error(diag_get());
                        assert(e != NULL);
+                       if (e->type != &type_ClientError)
+                               return -1;
                        say_info("upsert %s is not applied due to the error: %s",
                                 vy_stmt_str(h->tuple), e->errmsg);
                        vy_stmt_unref_if_possible(h->tuple);
@@ -878,6 +880,8 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
                if (applied == NULL) {
                        struct error *e = diag_last_error(diag_get());
                        assert(e != NULL);
+                       if (e->type != &type_ClientError)
+                               return -1;
                        say_info("upsert %s is not applied due to the error: %s",
                                 vy_stmt_str(h->tuple), e->errmsg);
                } else {

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-14  6:56       ` Konstantin Osipov
@ 2020-05-19 19:10         ` Nikita Pettik
  2020-05-19 19:39           ` Konstantin Osipov
  0 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-05-19 19:10 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches, v.shpilevoy

On 14 May 09:56, Konstantin Osipov wrote:
> * Nikita Pettik <korablev@tarantool.org> [20/05/14 09:50]:
> > On 14 Apr 01:12, Konstantin Osipov wrote:
> > > * Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> 
> > > Finally, let's double check there are no issues with the used
> > > format - can it become obsolete by the time it's used, e.g. if
> > > there is an online/non-blocking schema change that happened in tx
> > > thread (compaction is running in the write thread)?
> > 
> > Upserts are supported only by primary index. Meanwhile vinyl does
> > not support altering primary index of non-empty space. Am I missing
> > something?
> 
> I mean the space format object itself. Squashing is happening in a
> different thread, the 
> > 

Seems like you've cut sentence. Could you please suggest exact
test scenario? We already have similar test in vinyl/errinj.test.lua:

...

errinj.set("ERRINJ_VY_SQUASH_TIMEOUT", 0.050)
s = box.schema.space.create('test', {engine='vinyl'})
_ = s:create_index('pk')
s:insert{0, 0}
box.snapshot()
for i=1,256 do s:upsert({0, 0}, {{'+', 2, 1}}) end
box.snapshot() -- in-memory tree is gone
fiber.sleep(0.05)
s:select()
 
> -- 
> Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-19 19:10         ` Nikita Pettik
@ 2020-05-19 19:39           ` Konstantin Osipov
  2020-05-21  2:51             ` Nikita Pettik
  0 siblings, 1 reply; 29+ messages in thread
From: Konstantin Osipov @ 2020-05-19 19:39 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

* Nikita Pettik <korablev@tarantool.org> [20/05/19 22:10]:
> On 14 May 09:56, Konstantin Osipov wrote:
> > * Nikita Pettik <korablev@tarantool.org> [20/05/14 09:50]:
> > > On 14 Apr 01:12, Konstantin Osipov wrote:
> > > > * Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> > 
> > > > Finally, let's double check there are no issues with the used
> > > > format - can it become obsolete by the time it's used, e.g. if
> > > > there is an online/non-blocking schema change that happened in tx
> > > > thread (compaction is running in the write thread)?
> > > 
> > > Upserts are supported only by primary index. Meanwhile vinyl does
> > > not support altering primary index of non-empty space. Am I missing
> > > something?
> > 
> > I mean the space format object itself. Squashing is happening in a
> > different thread, the 
> > > 
> 
> Seems like you've cut sentence. Could you please suggest exact
> test scenario? We already have similar test in vinyl/errinj.test.lua:

1. Memtable dump is started.
2. Space is altered.
3. The format object is gone.
4. Memtable dump uses the old format object for upsert
   squashing.

Similar with compaction:
1. Compaction starts and uses the existing format objects
2. It uses the format object for upsert squashing
3. Space is altered and format is changed
4. We report a squash failure/suppress failure.


This is both related to memory liveness issues (e.g. we may use a
freed format object, which isn't the case because we refcount them
at start of dump/compaction), but also correctness issues: we
accept data which is no longer acceptable according to the new
format, or report/skip data which shouldn't be skipped.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-19 18:18         ` Nikita Pettik
@ 2020-05-20 22:13           ` Vladislav Shpilevoy
  0 siblings, 0 replies; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-20 22:13 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Thanks for the fixes!

>> On 14/05/2020 04:21, Nikita Pettik wrote:
>>> On 01 May 02:31, Vladislav Shpilevoy wrote:
>>>> Hi! Thanks for the patch!
>>>>
>>>> Firstly, Kostja left some comments here. Would be cool to address them.
>>>
>>> Done (sorry, I did not ignore them, just had to work on other more vital bugs).
>>>  
>>>> Secondly, here is my personal opinion. I don't like just skipping things
>>>> a user committed without any error appearing in the application. IMO, we
>>>> should apply only the first commit. And let a user see this error so as he
>>>> could notice the problem. To fix reads he could do delete() of the bad key.
>>>
>>> The problem with delete it leaves user no way to restore the rest
>>> of upsert history. Moreover, these upserts will get stuck until
>>> user finds in logs corresponding error (I guess we can't abort
>>> compaction due to invalid upserts).
>>>
>>>> However, how a user will be able to find the exact broken key - I don't
>>>> know. Maybe the ignore + logging is better.
>>>
>>> Why can't we just log broken key? E.g. see logs in vy_apply_upsert().
>>
>> We can log it. This is what you do in this patchset.
>>
>> I also noticed, that you skip the failed upsert in case of any error.
>> Even if it is an OOM, not related to format problems. I think it is safer
>> to check error type, and skip it only in case of a ClientError.
> 
> Personally I don't like relying on error type, but this pattern is
> already in the source code (e.g. upsert_do_ops), so here's diff:

Yeah, I agree. Error type is rather a crutch. And should be used only
in userspace code. As an alternative, you can add an out parameter
to vy_apply_upsert(), 'is_format_invalid' or something like this,
and set it to true in case of tuple_validate_raw() error. Set to false
for any other errors.

However error type solution also looks ok, so up to you.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-04-13 21:55 [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Nikita Pettik
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied Nikita Pettik
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash Nikita Pettik
@ 2020-05-20 22:13 ` Vladislav Shpilevoy
  2020-05-22  2:42   ` Nikita Pettik
  2020-06-02 21:36 ` Vladislav Shpilevoy
  3 siblings, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-20 22:13 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Thanks for the patchset!

I grepped for other vy_apply_upsert() usage cases and found these:

- vy_build_recover_stmt() - should we patch it too?

- vy_tx_write() - here the upsert is skipped in case of any error, but
  what if it was not a ClientError?

On 13/04/2020 23:55, Nikita Pettik wrote:
> Branch: https://github.com/tarantool/tarantool/tree/np/gh-1622-skip-invalid-upserts
> Issue: https://github.com/tarantool/tarantool/issues/1622
> 
> In fact, the first patch in series does not make much sense without
> second since otherwise dump or compaction processes (and any other
> operations that read data as well) will fail whenever they are started.
> But still I've divided fix into two patches for the sake of review
> simplicity.
> 
> Nikita Pettik (2):
>   vinyl: validate resulting tuple after upsert is applied
>   vinyl: skip invalid upserts during squash
> 
>  src/box/vy_history.c                          |  20 ++-
>  src/box/vy_lsm.c                              |  13 +-
>  src/box/vy_tx.c                               |  29 ++--
>  src/box/vy_upsert.c                           |   4 +
>  src/box/vy_write_iterator.c                   |  34 +++--
>  .../vinyl/gh-1622-skip-invalid-upserts.result | 135 ++++++++++++++++++
>  .../gh-1622-skip-invalid-upserts.test.lua     |  50 +++++++
>  7 files changed, 258 insertions(+), 27 deletions(-)
>  create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.result
>  create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.test.lua
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-19 19:39           ` Konstantin Osipov
@ 2020-05-21  2:51             ` Nikita Pettik
  2020-05-21  8:36               ` Konstantin Osipov
  0 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-05-21  2:51 UTC (permalink / raw)
  To: Konstantin Osipov, tarantool-patches, v.shpilevoy

On 19 May 22:39, Konstantin Osipov wrote:
> * Nikita Pettik <korablev@tarantool.org> [20/05/19 22:10]:
> > On 14 May 09:56, Konstantin Osipov wrote:
> > > * Nikita Pettik <korablev@tarantool.org> [20/05/14 09:50]:
> > > > On 14 Apr 01:12, Konstantin Osipov wrote:
> > > > > * Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> > > 
> > > > > Finally, let's double check there are no issues with the used
> > > > > format - can it become obsolete by the time it's used, e.g. if
> > > > > there is an online/non-blocking schema change that happened in tx
> > > > > thread (compaction is running in the write thread)?
> > > > 
> > > > Upserts are supported only by primary index. Meanwhile vinyl does
> > > > not support altering primary index of non-empty space. Am I missing
> > > > something?
> > > 
> > > I mean the space format object itself. Squashing is happening in a
> > > different thread, the 
> > > > 
> > 
> > Seems like you've cut sentence. Could you please suggest exact
> > test scenario? We already have similar test in vinyl/errinj.test.lua:
> 
> 1. Memtable dump is started.
> 2. Space is altered.
> 3. The format object is gone.
> 4. Memtable dump uses the old format object for upsert
>    squashing.

Look, the format is used for upsert squashing is stored in write
iterator: in vy_read_view_merge() it is passed to vy_apply_upsert()
from stream->format. When write iterator is created, format which
is obtained (i.e. lsm->disk_format) is refed. So, as you can see,
it can't be destroyed before write iterator is finished.

Finally, as I already mentioned, lsm used for squashing is PK,
and PK in turn can't be altered (lsm->disk_format will be destroyed
only on pk:drop()).

> Similar with compaction:
> 1. Compaction starts and uses the existing format objects
> 2. It uses the format object for upsert squashing
> 3. Space is altered and format is changed
> 4. We report a squash failure/suppress failure.
> 
> 
> This is both related to memory liveness issues (e.g. we may use a
> freed format object, which isn't the case because we refcount them
> at start of dump/compaction), but also correctness issues: we
> accept data which is no longer acceptable according to the new
> format, or report/skip data which shouldn't be skipped.

Data will always will be acceptable according to the format because
PK format can't change.

> -- 
> Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-21  2:51             ` Nikita Pettik
@ 2020-05-21  8:36               ` Konstantin Osipov
  0 siblings, 0 replies; 29+ messages in thread
From: Konstantin Osipov @ 2020-05-21  8:36 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

* Nikita Pettik <korablev@tarantool.org> [20/05/21 05:54]:

> Data will always will be acceptable according to the format because
> PK format can't change.

The format used for squashing may be taken from the primary key
lsm, but there are type constraints which impact even non-indexed
fields.

Imagine nullable -> not null change in SQL, and a pending upsert
which assigns NULL to a not nullable field. 

Or addition/removal of is_optional on a field in the format.

We don't allow much there, but it would be good to at least
*consider* what we allow and make sure it's safe.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-05-20 22:13 ` [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Vladislav Shpilevoy
@ 2020-05-22  2:42   ` Nikita Pettik
  2020-05-26 21:33     ` Vladislav Shpilevoy
  2020-06-22 14:13     ` Aleksandr Lyapunov
  0 siblings, 2 replies; 29+ messages in thread
From: Nikita Pettik @ 2020-05-22  2:42 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 10515 bytes --]

On 21 May 00:13, Vladislav Shpilevoy wrote:
> Thanks for the patchset!
> 
> I grepped for other vy_apply_upsert() usage cases and found these:
> 
> - vy_build_recover_stmt() - should we patch it too?

No, I'd say not. If user wants to skip errors and continue recovery
process one can use force_recovery option.

> - vy_tx_write() - here the upsert is skipped in case of any error, but
>   what if it was not a ClientError?

Originally, there was a comment:

/*
 * Ignore a memory error, because it is
 * not critical to apply the optimization.
 */

So I decided to keep this logic. I guess returning -1 in case of OOM
is also OK. So I can add check here as well. Let me know.

Btw, I've found another one bug concerned with upserts. It is revealed
since upserts can turn out to be invalid not only due to OOM.
Here's the patch (see also as an attachment). It is located on the top
of np/gh-1622-skip-invalid-upserts branch.

Author: Nikita Pettik <korablev@tarantool.org>
Date:   Fri May 22 04:38:45 2020 +0300

    vinyl: handle invalid upserts in background squash process
    
    When L0 collects more than certain number of upserts (to be more precise
    VY_UPSERT_THRESHOLD == 128), background process is launched to squash
    them. During this squash, new statements could be inserted into L0.
    Until committed prepared statements feature lsn = MAX_LSN as a special
    marker. It is assumed that all upserts will be successfully squashed. As
    a result statements with given key in L0 will be able to have only
    MAX_LSN lsn (except for the squash result). However, it is not so if at
    least one upsert is failed to be applied: it will get stack in L0 until
    dump and it has normal (i.e. < MAX_LSN) lsn. So the assertion which
    verifies that lsn of upsert in L0 is greater than MAX_LSN is wrong.
    Moreover, if we execute enough (> 128) invalid upserts, they will
    prevent from squashing normal upserts. So let's skip invalid upserts and
    don't account them while calculating number of upserts residing in L0.
    
    Follow-up #1622

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 55a4c8fb7..b39e3b9d8 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
                if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
                    vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
                        break;
-               assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
-               vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
-               if (n_upserts <= VY_UPSERT_THRESHOLD)
-                       ++n_upserts;
+               /*
+                * Upsert was not applied in vy_history_apply(),
+                * but it still resides in tree memory. Ignore
+                * such statements and do not account them while
+                * counting upserts. Otherwise invalid upserts will
+                * get stack and won't allow other upserts to squash.
+                */
+               if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) {
+                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt,
+                                             n_upserts);
+                       if (n_upserts <= VY_UPSERT_THRESHOLD)
+                               ++n_upserts;
+               } else {
+                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0);
+               }
                vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
        }
 
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result
index b59ed53dc..12ca65c19 100644
--- a/test/vinyl/gh-1622-skip-invalid-upserts.result
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.result
@@ -130,6 +130,140 @@ s:select{}
  | - - [2, 13]
  | ...
 
+-- Try to insert too many invalid upserts in a row so that
+-- they trigger squashing procedure.
+--
+fiber = require('fiber')
+ | ---
+ | ...
+s:drop()
+ | ---
+ | ...
+
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+ | ---
+ | ...
+pk = s:create_index('pk')
+ | ---
+ | ...
+s:replace({1, 1})
+ | ---
+ | - [1, 1]
+ | ...
+
+squashed_before = pk:stat().upsert.squashed
+ | ---
+ | ...
+applied_before = pk:stat().upsert.applied
+ | ---
+ | ...
+len_before = s:len()
+ | ---
+ | ...
+upsert_cnt = 129
+ | ---
+ | ...
+for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end
+ | ---
+ | ...
+-- 1 squash is accounted in vy_squash_process() unconditionally.
+--
+while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end
+ | ---
+ | ...
+-- Make sure invalid upserts are committed into tree.
+--
+assert(s:len() == len_before + upsert_cnt)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == applied_before)
+ | ---
+ | - true
+ | ...
+-- Adding new invalid upserts will not result in squash until
+-- we reach VY_UPSERT_THRESHOLD (128) limit again.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+ | ---
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+ | ---
+ | - true
+ | ...
+for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end
+ | ---
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+ | ---
+ | - true
+ | ...
+s:upsert({1, 1}, {{'=', 3, 5}})
+ | ---
+ | ...
+while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end
+ | ---
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == applied_before)
+ | ---
+ | - true
+ | ...
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+-- No one upsert is applied so number of rows should not change.
+--
+assert(s:len() == len_before)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == applied_before)
+ | ---
+ | - true
+ | ...
+-- Now let's mix valid and invalid upserts. Make sure that only
+-- valid are accounted during squash/apply.
+--
+for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end
+ | ---
+ | ...
+while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end
+ | ---
+ | ...
+assert(s:len() == len_before + upsert_cnt)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+ | ---
+ | - true
+ | ...
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+assert(s:len() == len_before)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 3)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+ | ---
+ | - true
+ | ...
+
 s:drop()
  | ---
  | ...
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
index eb93393be..27c168fcb 100644
--- a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
@@ -47,4 +47,55 @@ s:select{}
 box.snapshot()
 s:select{}
 
+-- Try to insert too many invalid upserts in a row so that
+-- they trigger squashing procedure.
+--
+fiber = require('fiber')
+s:drop()
+
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+pk = s:create_index('pk')
+s:replace({1, 1})
+
+squashed_before = pk:stat().upsert.squashed
+applied_before = pk:stat().upsert.applied
+len_before = s:len()
+upsert_cnt = 129
+for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end
+-- 1 squash is accounted in vy_squash_process() unconditionally.
+--
+while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end
+-- Make sure invalid upserts are committed into tree.
+--
+assert(s:len() == len_before + upsert_cnt)
+assert(pk:stat().upsert.applied == applied_before)
+-- Adding new invalid upserts will not result in squash until
+-- we reach VY_UPSERT_THRESHOLD (128) limit again.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+s:upsert({1, 1}, {{'=', 3, 5}})
+while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+assert(pk:stat().upsert.applied == applied_before)
+box.snapshot()
+-- No one upsert is applied so number of rows should not change.
+--
+assert(s:len() == len_before)
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+assert(pk:stat().upsert.applied == applied_before)
+-- Now let's mix valid and invalid upserts. Make sure that only
+-- valid are accounted during squash/apply.
+--
+for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end
+while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end
+assert(s:len() == len_before + upsert_cnt)
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+box.snapshot()
+assert(s:len() == len_before)
+assert(pk:stat().upsert.squashed == squashed_before + 3)
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+
 s:drop()
\ No newline at end of file


 
> On 13/04/2020 23:55, Nikita Pettik wrote:
> > Branch: https://github.com/tarantool/tarantool/tree/np/gh-1622-skip-invalid-upserts
> > Issue: https://github.com/tarantool/tarantool/issues/1622
> > 
> > In fact, the first patch in series does not make much sense without
> > second since otherwise dump or compaction processes (and any other
> > operations that read data as well) will fail whenever they are started.
> > But still I've divided fix into two patches for the sake of review
> > simplicity.
> > 
> > Nikita Pettik (2):
> >   vinyl: validate resulting tuple after upsert is applied
> >   vinyl: skip invalid upserts during squash
> > 
> >  src/box/vy_history.c                          |  20 ++-
> >  src/box/vy_lsm.c                              |  13 +-
> >  src/box/vy_tx.c                               |  29 ++--
> >  src/box/vy_upsert.c                           |   4 +
> >  src/box/vy_write_iterator.c                   |  34 +++--
> >  .../vinyl/gh-1622-skip-invalid-upserts.result | 135 ++++++++++++++++++
> >  .../gh-1622-skip-invalid-upserts.test.lua     |  50 +++++++
> >  7 files changed, 258 insertions(+), 27 deletions(-)
> >  create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.result
> >  create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.test.lua
> > 

[-- Attachment #2: 0001-vinyl-handle-invalid-upserts-in-background-squash-pr.patch --]
[-- Type: text/x-diff, Size: 8277 bytes --]

From b4a8349217abe5727047751ce55fe09f620b3669 Mon Sep 17 00:00:00 2001
Message-Id: <b4a8349217abe5727047751ce55fe09f620b3669.1590114943.git.korablev@tarantool.org>
From: Nikita Pettik <korablev@tarantool.org>
Date: Fri, 22 May 2020 04:38:45 +0300
Subject: [PATCH] vinyl: handle invalid upserts in background squash process

When L0 collects more than certain number of upserts (to be more precise
VY_UPSERT_THRESHOLD == 128), background process is launched to squash
them. During this squash, new statements could be inserted into L0.
Until committed prepared statements feature lsn = MAX_LSN as a special
marker. It is assumed that all upserts will be successfully squashed. As
a result statements with given key in L0 will be able to have only
MAX_LSN lsn (except for the squash result). However, it is not so if at
least one upsert is failed to be applied: it will get stack in L0 until
dump and it has normal (i.e. < MAX_LSN) lsn. So the assertion which
verifies that lsn of upsert in L0 is greater than MAX_LSN is wrong.
Moreover, if we execute enough (> 128) invalid upserts, they will
prevent from squashing normal upserts. So let's skip invalid upserts and
don't account them while calculating number of upserts residing in L0.

Follow-up #1622
---
 src/box/vinyl.c                               |  19 ++-
 .../vinyl/gh-1622-skip-invalid-upserts.result | 134 ++++++++++++++++++
 .../gh-1622-skip-invalid-upserts.test.lua     |  51 +++++++
 3 files changed, 200 insertions(+), 4 deletions(-)

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 55a4c8fb7..b39e3b9d8 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
 		if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
 		    vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
 			break;
-		assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
-		vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
-		if (n_upserts <= VY_UPSERT_THRESHOLD)
-			++n_upserts;
+		/*
+		 * Upsert was not applied in vy_history_apply(),
+		 * but it still resides in tree memory. Ignore
+		 * such statements and do not account them while
+		 * counting upserts. Otherwise invalid upserts will
+		 * get stack and won't allow other upserts to squash.
+		 */
+		if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) {
+			vy_stmt_set_n_upserts((struct tuple *) mem_stmt,
+					      n_upserts);
+			if (n_upserts <= VY_UPSERT_THRESHOLD)
+				++n_upserts;
+		} else {
+			vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0);
+		}
 		vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
 	}
 
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result
index b59ed53dc..12ca65c19 100644
--- a/test/vinyl/gh-1622-skip-invalid-upserts.result
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.result
@@ -130,6 +130,140 @@ s:select{}
  | - - [2, 13]
  | ...
 
+-- Try to insert too many invalid upserts in a row so that
+-- they trigger squashing procedure.
+--
+fiber = require('fiber')
+ | ---
+ | ...
+s:drop()
+ | ---
+ | ...
+
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+ | ---
+ | ...
+pk = s:create_index('pk')
+ | ---
+ | ...
+s:replace({1, 1})
+ | ---
+ | - [1, 1]
+ | ...
+
+squashed_before = pk:stat().upsert.squashed
+ | ---
+ | ...
+applied_before = pk:stat().upsert.applied
+ | ---
+ | ...
+len_before = s:len()
+ | ---
+ | ...
+upsert_cnt = 129
+ | ---
+ | ...
+for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end
+ | ---
+ | ...
+-- 1 squash is accounted in vy_squash_process() unconditionally.
+--
+while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end
+ | ---
+ | ...
+-- Make sure invalid upserts are committed into tree.
+--
+assert(s:len() == len_before + upsert_cnt)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == applied_before)
+ | ---
+ | - true
+ | ...
+-- Adding new invalid upserts will not result in squash until
+-- we reach VY_UPSERT_THRESHOLD (128) limit again.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+ | ---
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+ | ---
+ | - true
+ | ...
+for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end
+ | ---
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+ | ---
+ | - true
+ | ...
+s:upsert({1, 1}, {{'=', 3, 5}})
+ | ---
+ | ...
+while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end
+ | ---
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == applied_before)
+ | ---
+ | - true
+ | ...
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+-- No one upsert is applied so number of rows should not change.
+--
+assert(s:len() == len_before)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == applied_before)
+ | ---
+ | - true
+ | ...
+-- Now let's mix valid and invalid upserts. Make sure that only
+-- valid are accounted during squash/apply.
+--
+for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end
+ | ---
+ | ...
+while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end
+ | ---
+ | ...
+assert(s:len() == len_before + upsert_cnt)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+ | ---
+ | - true
+ | ...
+box.snapshot()
+ | ---
+ | - ok
+ | ...
+assert(s:len() == len_before)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.squashed == squashed_before + 3)
+ | ---
+ | - true
+ | ...
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+ | ---
+ | - true
+ | ...
+
 s:drop()
  | ---
  | ...
diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
index eb93393be..27c168fcb 100644
--- a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
+++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua
@@ -47,4 +47,55 @@ s:select{}
 box.snapshot()
 s:select{}
 
+-- Try to insert too many invalid upserts in a row so that
+-- they trigger squashing procedure.
+--
+fiber = require('fiber')
+s:drop()
+
+s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 })
+pk = s:create_index('pk')
+s:replace({1, 1})
+
+squashed_before = pk:stat().upsert.squashed
+applied_before = pk:stat().upsert.applied
+len_before = s:len()
+upsert_cnt = 129
+for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end
+-- 1 squash is accounted in vy_squash_process() unconditionally.
+--
+while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end
+-- Make sure invalid upserts are committed into tree.
+--
+assert(s:len() == len_before + upsert_cnt)
+assert(pk:stat().upsert.applied == applied_before)
+-- Adding new invalid upserts will not result in squash until
+-- we reach VY_UPSERT_THRESHOLD (128) limit again.
+--
+s:upsert({1, 1}, {{'=', 3, 5}})
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end
+assert(pk:stat().upsert.squashed == squashed_before + 1)
+s:upsert({1, 1}, {{'=', 3, 5}})
+while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+assert(pk:stat().upsert.applied == applied_before)
+box.snapshot()
+-- No one upsert is applied so number of rows should not change.
+--
+assert(s:len() == len_before)
+assert(pk:stat().upsert.squashed == squashed_before + 2)
+assert(pk:stat().upsert.applied == applied_before)
+-- Now let's mix valid and invalid upserts. Make sure that only
+-- valid are accounted during squash/apply.
+--
+for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end
+while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end
+assert(s:len() == len_before + upsert_cnt)
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+box.snapshot()
+assert(s:len() == len_before)
+assert(pk:stat().upsert.squashed == squashed_before + 3)
+assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1)
+
 s:drop()
\ No newline at end of file
-- 
2.17.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-05-22  2:42   ` Nikita Pettik
@ 2020-05-26 21:33     ` Vladislav Shpilevoy
  2020-05-27 20:10       ` Nikita Pettik
  2020-06-22 14:13     ` Aleksandr Lyapunov
  1 sibling, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-26 21:33 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Thanks for the patch, nice bug you found!

>> - vy_tx_write() - here the upsert is skipped in case of any error, but
>>   what if it was not a ClientError?
> 
> Originally, there was a comment:
> 
> /*
>  * Ignore a memory error, because it is
>  * not critical to apply the optimization.
>  */
> 
> So I decided to keep this logic. I guess returning -1 in case of OOM
> is also OK. So I can add check here as well. Let me know.

Aha, this is an optimization. I didn't notice that the tuple ends up
in the tree anyway.

> Btw, I've found another one bug concerned with upserts. It is revealed
> since upserts can turn out to be invalid not only due to OOM.
> Here's the patch (see also as an attachment). It is located on the top
> of np/gh-1622-skip-invalid-upserts branch.
> 
> Author: Nikita Pettik <korablev@tarantool.org>
> Date:   Fri May 22 04:38:45 2020 +0300
> 
>     vinyl: handle invalid upserts in background squash process
>     
>     When L0 collects more than certain number of upserts (to be more precise
>     VY_UPSERT_THRESHOLD == 128), background process is launched to squash
>     them. During this squash, new statements could be inserted into L0.
>     Until committed prepared statements feature lsn = MAX_LSN as a special
>     marker.

Did you mean 'until' -> 'while'?

>     It is assumed that all upserts will be successfully squashed. As
>     a result statements with given key in L0 will be able to have only
>     MAX_LSN lsn (except for the squash result).

Honestly, I didn't understand anything from this. I got more info from
reading squashing code source. Probably this needs some rewording. For
example, in the second sentence say 'After squash it is expected all
newer statements on top of the squashed tuple have MAX_LSN, i.e. they
are prepared but not committed. Otherwise they would end up being squashed
too.'

>     However, it is not so if at
>     least one upsert is failed to be applied: it will get stack in L0 until

stack -> stuck? The same in the code comments.

>     dump and it has normal (i.e. < MAX_LSN) lsn. So the assertion which>     verifies that lsn of upsert in L0 is greater than MAX_LSN is wrong.
>     Moreover, if we execute enough (> 128) invalid upserts, they will
>     prevent from squashing normal upserts. So let's skip invalid upserts and
>     don't account them while calculating number of upserts residing in L0.
>     
>     Follow-up #1622
> 
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index 55a4c8fb7..b39e3b9d8 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
>                 if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
>                     vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
>                         break;
> -               assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
> -               vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
> -               if (n_upserts <= VY_UPSERT_THRESHOLD)
> -                       ++n_upserts;
> +               /*
> +                * Upsert was not applied in vy_history_apply(),
> +                * but it still resides in tree memory.

Could we remove these upserts from memory right away? The code below was not
simple beforehand, and now looks even more complex. I am looking for a way
to simplify it anyhow.

Could vy_tx_write() fix help? If you would skip them there instead of writing
them to the tree anyway. Or is it something totally not related?

> +                * Ignore
> +                * such statements and do not account them while
> +                * counting upserts. Otherwise invalid upserts will
> +                * get stack and won't allow other upserts to squash.
> +                */
> +               if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) {
> +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt,
> +                                             n_upserts);
> +                       if (n_upserts <= VY_UPSERT_THRESHOLD)
> +                               ++n_upserts;
> +               } else {
> +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0);
> +               }
>                 vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
>         }

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-01  0:31   ` Vladislav Shpilevoy
  2020-05-14  2:21     ` Nikita Pettik
@ 2020-05-26 21:33     ` Vladislav Shpilevoy
  2020-05-27 20:05       ` Nikita Pettik
  1 sibling, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-26 21:33 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Hi! Thanks for the patch!

> diff --git a/src/box/vy_history.c b/src/box/vy_history.c
> index 0f3b71195..2232348f9 100644
> --- a/src/box/vy_history.c
> +++ b/src/box/vy_history.c
> @@ -102,12 +102,22 @@ vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
>  	while (node != NULL) {
>  		struct tuple *stmt = vy_apply_upsert(node->stmt, curr_stmt,
>  						     cmp_def, format, true);
> -		++*upserts_applied;
> -		if (curr_stmt != NULL)
> -			tuple_unref(curr_stmt);
> -		if (stmt == NULL)
> -			return -1;
> -		curr_stmt = stmt;
> +		if (stmt == NULL) {
> +			/*
> +			 * In case statement hasn't been applied,
> +			 * simply skip it ignoring errors (otherwise
> +			 * error will appear during tuple read).
> +			 */
> +			assert(diag_last_error(diag_get()) != NULL);
> +			struct error *e = diag_last_error(diag_get());
> +			if (e->type != &type_ClientError)
> +				return -1;

*. Shouldn't we log here?

> diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
> index 04c9926a8..3d630fc01 100644
> --- a/src/box/vy_lsm.c
> +++ b/src/box/vy_lsm.c
> @@ -984,13 +984,20 @@ vy_lsm_commit_upsert(struct vy_lsm *lsm, struct vy_mem *mem,
>  		struct tuple *upserted =
>  			vy_apply_upsert(stmt, older, lsm->cmp_def,
>  					lsm->mem_format, false);
> -		lsm->stat.upsert.applied++;
> -
>  		if (upserted == NULL) {
> -			/* OOM */
> +			/*
> +			 * OOM or upserted tuple does not fulfill
> +			 * space format. Hence, upsert can't be
> +			 * transformed into replace. Log error
> +			 * and exit.
> +			 */
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			error_log(e);

*. These three lines can be replaced with diag_log();.

> diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
> index 5029bd8a1..1a2ea43d9 100644
> --- a/src/box/vy_tx.c
> +++ b/src/box/vy_tx.c
> @@ -515,11 +515,11 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
>  						    region_stmt);
>  				tuple_unref(applied);
>  				return rc;
> +			} else {
> +				struct error *e = diag_last_error(diag_get());
> +				assert(e != NULL);
> +				error_log(e);

*. The same here.

> diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
> index 7a6a20627..4d34d96c8 100644
> --- a/src/box/vy_write_iterator.c
> +++ b/src/box/vy_write_iterator.c
> @@ -850,10 +850,23 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
>  		       vy_stmt_type(hint) != IPROTO_UPSERT);
>  		struct tuple *applied = vy_apply_upsert(h->tuple, hint,
>  				stream->cmp_def, stream->format, false);
> -		if (applied == NULL)
> -			return -1;
> -		vy_stmt_unref_if_possible(h->tuple);
> -		h->tuple = applied;
> +		if (applied == NULL) {
> +			/*
> +			 * Current upsert can't be applied.
> +			 * Let's skip it and log error.
> +			 */
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			if (e->type != &type_ClientError)
> +				return -1;
> +			say_info("upsert %s is not applied due to the error: %s",
> +				 vy_stmt_str(h->tuple), e->errmsg);
> +			vy_stmt_unref_if_possible(h->tuple);

*. Why here we use sometimes say_info() and sometimes error_log()? Why
not something one?

> @@ -864,10 +877,17 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
>  		assert(result->tuple != NULL);
>  		struct tuple *applied = vy_apply_upsert(h->tuple, result->tuple,
>  					stream->cmp_def, stream->format, false);
> -		if (applied == NULL)
> -			return -1;
> -		vy_stmt_unref_if_possible(result->tuple);
> -		result->tuple = applied;
> +		if (applied == NULL) {
> +			struct error *e = diag_last_error(diag_get());
> +			assert(e != NULL);
> +			if (e->type != &type_ClientError)
> +				return -1;
> +			say_info("upsert %s is not applied due to the error: %s",
> +				 vy_stmt_str(h->tuple), e->errmsg);
> +		} else {
> +			vy_stmt_unref_if_possible(result->tuple);
> +			result->tuple = applied;
> +		}
>  		vy_stmt_unref_if_possible(h->tuple);
>  		/*
>  		 * Don't bother freeing 'h' since it's

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-26 21:33     ` Vladislav Shpilevoy
@ 2020-05-27 20:05       ` Nikita Pettik
  2020-05-29 21:47         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-05-27 20:05 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 26 May 23:33, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> > diff --git a/src/box/vy_history.c b/src/box/vy_history.c
> > index 0f3b71195..2232348f9 100644
> > --- a/src/box/vy_history.c
> > +++ b/src/box/vy_history.c
> > @@ -102,12 +102,22 @@ vy_history_apply(struct vy_history *history, struct key_def *cmp_def,
> >  	while (node != NULL) {
> >  		struct tuple *stmt = vy_apply_upsert(node->stmt, curr_stmt,
> >  						     cmp_def, format, true);
> > -		++*upserts_applied;
> > -		if (curr_stmt != NULL)
> > -			tuple_unref(curr_stmt);
> > -		if (stmt == NULL)
> > -			return -1;
> > -		curr_stmt = stmt;
> > +		if (stmt == NULL) {
> > +			/*
> > +			 * In case statement hasn't been applied,
> > +			 * simply skip it ignoring errors (otherwise
> > +			 * error will appear during tuple read).
> > +			 */
> > +			assert(diag_last_error(diag_get()) != NULL);
> > +			struct error *e = diag_last_error(diag_get());
> > +			if (e->type != &type_ClientError)
> > +				return -1;
> 
> *. Shouldn't we log here?

No, we shouldn't. vy_history_apply() is called only during reads.
So in order to avoid spam the same error each time on data selection,
we don't log it here.
 
> > diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
> > index 04c9926a8..3d630fc01 100644
> > --- a/src/box/vy_lsm.c
> > +++ b/src/box/vy_lsm.c
> > @@ -984,13 +984,20 @@ vy_lsm_commit_upsert(struct vy_lsm *lsm, struct vy_mem *mem,
> >  		struct tuple *upserted =
> >  			vy_apply_upsert(stmt, older, lsm->cmp_def,
> >  					lsm->mem_format, false);
> > -		lsm->stat.upsert.applied++;
> > -
> >  		if (upserted == NULL) {
> > -			/* OOM */
> > +			/*
> > +			 * OOM or upserted tuple does not fulfill
> > +			 * space format. Hence, upsert can't be
> > +			 * transformed into replace. Log error
> > +			 * and exit.
> > +			 */
> > +			struct error *e = diag_last_error(diag_get());
> > +			assert(e != NULL);
> > +			error_log(e);
> 
> *. These three lines can be replaced with diag_log();.

Ok, let's use diag_log():

diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
index 3d630fc01..c768d0229 100644
--- a/src/box/vy_lsm.c
+++ b/src/box/vy_lsm.c
@@ -991,10 +991,7 @@ vy_lsm_commit_upsert(struct vy_lsm *lsm, struct vy_mem *mem,
                         * transformed into replace. Log error
                         * and exit.
                         */
-                       struct error *e = diag_last_error(diag_get());
-                       assert(e != NULL);
-                       error_log(e);
-                       diag_clear(diag_get());
+                       diag_log();
                        return;
                }
                lsm->stat.upsert.applied++;
diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
index 1a2ea43d9..0a14478fc 100644
--- a/src/box/vy_tx.c
+++ b/src/box/vy_tx.c
@@ -516,9 +516,7 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
                                tuple_unref(applied);
                                return rc;
                        } else {
-                               struct error *e = diag_last_error(diag_get());
-                               assert(e != NULL);
-                               error_log(e);
+                               diag_log();
                        }
                }
        } else {


> > diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
> > index 5029bd8a1..1a2ea43d9 100644
> > --- a/src/box/vy_tx.c
> > +++ b/src/box/vy_tx.c
> > @@ -515,11 +515,11 @@ vy_tx_write(struct vy_lsm *lsm, struct vy_mem *mem,
> >  						    region_stmt);
> >  				tuple_unref(applied);
> >  				return rc;
> > +			} else {
> > +				struct error *e = diag_last_error(diag_get());
> > +				assert(e != NULL);
> > +				error_log(e);
> 
> *. The same here.

See diff above.
 
> > diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
> > index 7a6a20627..4d34d96c8 100644
> > --- a/src/box/vy_write_iterator.c
> > +++ b/src/box/vy_write_iterator.c
> > @@ -850,10 +850,23 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
> >  		       vy_stmt_type(hint) != IPROTO_UPSERT);
> >  		struct tuple *applied = vy_apply_upsert(h->tuple, hint,
> >  				stream->cmp_def, stream->format, false);
> > -		if (applied == NULL)
> > -			return -1;
> > -		vy_stmt_unref_if_possible(h->tuple);
> > -		h->tuple = applied;
> > +		if (applied == NULL) {
> > +			/*
> > +			 * Current upsert can't be applied.
> > +			 * Let's skip it and log error.
> > +			 */
> > +			struct error *e = diag_last_error(diag_get());
> > +			assert(e != NULL);
> > +			if (e->type != &type_ClientError)
> > +				return -1;
> > +			say_info("upsert %s is not applied due to the error: %s",
> > +				 vy_stmt_str(h->tuple), e->errmsg);
> > +			vy_stmt_unref_if_possible(h->tuple);
> 
> *. Why here we use sometimes say_info() and sometimes error_log()? Why
> not something one?

Indeed, let's use here say_error():

diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
index 4d34d96c8..ce700f260 100644
--- a/src/box/vy_write_iterator.c
+++ b/src/box/vy_write_iterator.c
@@ -859,8 +859,8 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
                        assert(e != NULL);
                        if (e->type != &type_ClientError)
                                return -1;
-                       say_info("upsert %s is not applied due to the error: %s",
-                                vy_stmt_str(h->tuple), e->errmsg);
+                       say_error("upsert %s is not applied due to the error: %s",
+                                 vy_stmt_str(h->tuple), e->errmsg);
                        vy_stmt_unref_if_possible(h->tuple);
                        h->tuple = hint;
                } else {
@@ -882,8 +882,8 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
                        assert(e != NULL);
                        if (e->type != &type_ClientError)
                                return -1;
-                       say_info("upsert %s is not applied due to the error: %s",
-                                vy_stmt_str(h->tuple), e->errmsg);
+                       say_error("upsert %s is not applied due to the error: %s",
+                                 vy_stmt_str(h->tuple), e->errmsg);
                } else {
                        vy_stmt_unref_if_possible(result->tuple);
                        result->tuple = applied;

 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-05-26 21:33     ` Vladislav Shpilevoy
@ 2020-05-27 20:10       ` Nikita Pettik
  0 siblings, 0 replies; 29+ messages in thread
From: Nikita Pettik @ 2020-05-27 20:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 26 May 23:33, Vladislav Shpilevoy wrote:
> Thanks for the patch, nice bug you found!
> 
> > Author: Nikita Pettik <korablev@tarantool.org>
> > Date:   Fri May 22 04:38:45 2020 +0300
> > 
> >     vinyl: handle invalid upserts in background squash process
> >     
> >     When L0 collects more than certain number of upserts (to be more precise
> >     VY_UPSERT_THRESHOLD == 128), background process is launched to squash
> >     them. During this squash, new statements could be inserted into L0.
> >     Until committed prepared statements feature lsn = MAX_LSN as a special
> >     marker.
> 
> Did you mean 'until' -> 'while'?
> 
> >     It is assumed that all upserts will be successfully squashed. As
> >     a result statements with given key in L0 will be able to have only
> >     MAX_LSN lsn (except for the squash result).
> 
> Honestly, I didn't understand anything from this. I got more info from
> reading squashing code source. Probably this needs some rewording. For
> example, in the second sentence say 'After squash it is expected all
> newer statements on top of the squashed tuple have MAX_LSN, i.e. they
> are prepared but not committed. Otherwise they would end up being squashed
> too.'

Ok, sorry, probably it was really messy one. Look at this version:

vinyl: handle invalid upserts in background squash process

When L0 collects more than certain number of upserts (to be more precise
VY_UPSERT_THRESHOLD == 128), background process is launched to squash
them. Squash process does not prevent from handling new statements
being inserted in L0. Prepared (but still not committed) statements
feature lsn >= MAX_LSN as a special marker.
It is assumed (by mistake) that all upserts will be successfully
squashed. According to this assumption, the previous statement to squash
result is considered to have lsn >= MAX_LSN (as it is prepared but
still not committed; if it has been already committed then it is
affected by squash process). Obviously, it is not so if squashing fails.
In this case upserts which are not squashed remain in L0 but have normal
(i.e. < MAX_LSN) lsn. So the assertion which verifies that previous
statements are prepared but not committed is wrong.
Moreover, if we execute enough (more than threshold limit) invalid upserts,
they will prevent from squashing valid upserts. So let's skip invalid
upserts and don't account them while calculating number of upserts
residing in L0.

Follow-up #1622

> 
> >     However, it is not so if at
> >     least one upsert is failed to be applied: it will get stack in L0 until
> 
> stack -> stuck? The same in the code comments.

Fixed.

> > diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> > index 55a4c8fb7..b39e3b9d8 100644
> > --- a/src/box/vinyl.c
> > +++ b/src/box/vinyl.c
> > @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
> >                 if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
> >                     vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
> >                         break;
> > -               assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
> > -               vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
> > -               if (n_upserts <= VY_UPSERT_THRESHOLD)
> > -                       ++n_upserts;
> > +               /*
> > +                * Upsert was not applied in vy_history_apply(),
> > +                * but it still resides in tree memory.
> 
> Could we remove these upserts from memory right away? The code below was not
> simple beforehand, and now looks even more complex. I am looking for a way
> to simplify it anyhow.
> 
> Could vy_tx_write() fix help? If you would skip them there instead of writing
> them to the tree anyway. Or is it something totally not related?

These invalid upserts are already committed and residing in tree.
The only way to delete them is to generate DELETE statement and
insert it to the tree. I guess I can try this approach (I considered
it while was preparing this patch) but code definitely won't become
simpler (that's why I've choosen current one).
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-27 20:05       ` Nikita Pettik
@ 2020-05-29 21:47         ` Vladislav Shpilevoy
  2020-06-01 19:24           ` Nikita Pettik
  0 siblings, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-29 21:47 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Thanks for the fixes!

>>> diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
>>> index 7a6a20627..4d34d96c8 100644
>>> --- a/src/box/vy_write_iterator.c
>>> +++ b/src/box/vy_write_iterator.c
>>> @@ -850,10 +850,23 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
>>>  		       vy_stmt_type(hint) != IPROTO_UPSERT);
>>>  		struct tuple *applied = vy_apply_upsert(h->tuple, hint,
>>>  				stream->cmp_def, stream->format, false);
>>> -		if (applied == NULL)
>>> -			return -1;
>>> -		vy_stmt_unref_if_possible(h->tuple);
>>> -		h->tuple = applied;
>>> +		if (applied == NULL) {
>>> +			/*
>>> +			 * Current upsert can't be applied.
>>> +			 * Let's skip it and log error.
>>> +			 */
>>> +			struct error *e = diag_last_error(diag_get());
>>> +			assert(e != NULL);
>>> +			if (e->type != &type_ClientError)
>>> +				return -1;
>>> +			say_info("upsert %s is not applied due to the error: %s",
>>> +				 vy_stmt_str(h->tuple), e->errmsg);
>>> +			vy_stmt_unref_if_possible(h->tuple);
>>
>> *. Why here we use sometimes say_info() and sometimes error_log()? Why
>> not something one?
> 
> Indeed, let's use here say_error():

Now in some places we use diag_log() and in other say_error() + vy_stmt_str().
How is it chosen when what to use?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash
  2020-05-29 21:47         ` Vladislav Shpilevoy
@ 2020-06-01 19:24           ` Nikita Pettik
  0 siblings, 0 replies; 29+ messages in thread
From: Nikita Pettik @ 2020-06-01 19:24 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 29 May 23:47, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
> 
> >>> diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
> >>> index 7a6a20627..4d34d96c8 100644
> >>> --- a/src/box/vy_write_iterator.c
> >>> +++ b/src/box/vy_write_iterator.c
> >>> @@ -850,10 +850,23 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
> >>>  		       vy_stmt_type(hint) != IPROTO_UPSERT);
> >>>  		struct tuple *applied = vy_apply_upsert(h->tuple, hint,
> >>>  				stream->cmp_def, stream->format, false);
> >>> -		if (applied == NULL)
> >>> -			return -1;
> >>> -		vy_stmt_unref_if_possible(h->tuple);
> >>> -		h->tuple = applied;
> >>> +		if (applied == NULL) {
> >>> +			/*
> >>> +			 * Current upsert can't be applied.
> >>> +			 * Let's skip it and log error.
> >>> +			 */
> >>> +			struct error *e = diag_last_error(diag_get());
> >>> +			assert(e != NULL);
> >>> +			if (e->type != &type_ClientError)
> >>> +				return -1;
> >>> +			say_info("upsert %s is not applied due to the error: %s",
> >>> +				 vy_stmt_str(h->tuple), e->errmsg);
> >>> +			vy_stmt_unref_if_possible(h->tuple);
> >>
> >> *. Why here we use sometimes say_info() and sometimes error_log()? Why
> >> not something one?
> > 
> > Indeed, let's use here say_error():
> 
> Now in some places we use diag_log() and in other say_error() + vy_stmt_str().
> How is it chosen when what to use?


This place (vy_read_view_merge()) is called only during dump and compaction
processing. Logging tuple itself would significantly simply diagnostics.
Meanwhile in other places diag_log() fires once wrong tuple is inserted.
Btw, Konstantin agreed with this logging way:

>* Nikita Pettik <korablev@tarantool.org> [20/05/14 09:50]:
> On 14 Apr 01:12, Konstantin Osipov wrote:
> > * Nikita Pettik <korablev@tarantool.org> [20/04/14 00:57]:
> > > Instead of aborting merge sequence of upserts let's log appeared
> > > errors and skip upserts which can't be applied. It makes sense
> > > taking into consideration previous commit: now upsert statements which
> > > can't be applied may appear in mems/runs (previously squash operation
> > > might fail only due to OOM). As a result, if we didn't ignore invalid
> > > upserts, dump or compaction processes would not be able to finish (owing
> > > to inability to squash upserts).
> >
> > We should log the upsert itself, base64 encoded.
>
> It is logged with following format (in vy_read_view_merge()):
>
> say_info("upsert %s is not applied due to the error: %s",
> vy_stmt_str(h->tuple), e->errmsg);
>
> It is the function which is called during compaction to squash
> upserts.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-04-13 21:55 [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Nikita Pettik
                   ` (2 preceding siblings ...)
  2020-05-20 22:13 ` [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Vladislav Shpilevoy
@ 2020-06-02 21:36 ` Vladislav Shpilevoy
  2020-06-02 21:37   ` Vladislav Shpilevoy
  3 siblings, 1 reply; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-02 21:36 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Hi! Thanks for the patchset!

LGTM.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-06-02 21:36 ` Vladislav Shpilevoy
@ 2020-06-02 21:37   ` Vladislav Shpilevoy
  0 siblings, 0 replies; 29+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-02 21:37 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches

Although you could probably ask Alexander L. for a second review.

On 02/06/2020 23:36, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
> 
> LGTM.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-05-22  2:42   ` Nikita Pettik
  2020-05-26 21:33     ` Vladislav Shpilevoy
@ 2020-06-22 14:13     ` Aleksandr Lyapunov
  2020-06-22 20:21       ` Nikita Pettik
  1 sibling, 1 reply; 29+ messages in thread
From: Aleksandr Lyapunov @ 2020-06-22 14:13 UTC (permalink / raw)
  To: Nikita Pettik, Vladislav Shpilevoy; +Cc: tarantool-patches

HI, thanks for the patch! see one comment below

On 5/22/20 5:42 AM, Nikita Pettik wrote:
>
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index 55a4c8fb7..b39e3b9d8 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
>                  if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
>                      vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
>                          break;
> -               assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
> -               vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
> -               if (n_upserts <= VY_UPSERT_THRESHOLD)
> -                       ++n_upserts;
> +               /*
> +                * Upsert was not applied in vy_history_apply(),
> +                * but it still resides in tree memory. Ignore
> +                * such statements and do not account them while
> +                * counting upserts. Otherwise invalid upserts will
> +                * get stack and won't allow other upserts to squash.
> +                */
> +               if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) {
> +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt,
> +                                             n_upserts);
> +                       if (n_upserts <= VY_UPSERT_THRESHOLD)
> +                               ++n_upserts;
> +               } else {
> +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0);
> +               }
I bet we should just remove assert here. If mem_stmt have
a valid LSN now - it is a valid statement. The first of them must
have n_upserts=0, the second - 1 etc, and n_upsert of following
prepared statements must continue the sequence.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied
  2020-04-13 21:55 ` [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied Nikita Pettik
@ 2020-06-22 19:28   ` Aleksandr Lyapunov
  0 siblings, 0 replies; 29+ messages in thread
From: Aleksandr Lyapunov @ 2020-06-22 19:28 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches; +Cc: v.shpilevoy

Hi! thank again. See my comment below.

On 4/14/20 12:55 AM, Nikita Pettik wrote:
> diff --git a/src/box/vy_upsert.c b/src/box/vy_upsert.c
> index ebea2789c..6855b9820 100644
> --- a/src/box/vy_upsert.c
> +++ b/src/box/vy_upsert.c
> @@ -134,6 +134,10 @@ vy_apply_upsert(const struct tuple *new_stmt, const struct tuple *old_stmt,
>   					 &mp_size, 0, suppress_error,
>   					 &column_mask);
>   	result_mp_end = result_mp + mp_size;
> +	if (tuple_validate_raw(format, result_mp) != 0) {
> +		region_truncate(region, region_svp);
> +		return NULL;
> +	}
That is wrong for UPSERT-UPSERT case.
The result of applying an upsert to another must be such a cumulative
upsert that being applied to anything must have the same result as
two original upserts applied sequentially.
Also in any case, inapplicable ops must be skipped and inapplicable whole
upsert must be skipped.
Suppose we have two upserts:
upsert1 = {tuple1, {ops1}} and upsert2 = {tuple2, {ops2}},
and {ops2} are not applicable to tuple1.
What cumulative upsert must we create?
1)if apllied to NULL or DELETE first upsert must insert tuple1 and upsert2
must be skipped and thus the result must be tuple1.
2)if applied to some good tuple - all ops {ops1,ops2} must be applied.
Thus the cumulative upsert must be like:
{tuple1, {ops1, ops2}}
But in your code you create:
{tuple1, {ops1}}

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-06-22 14:13     ` Aleksandr Lyapunov
@ 2020-06-22 20:21       ` Nikita Pettik
  2020-06-23 12:32         ` Aleksandr Lyapunov
  0 siblings, 1 reply; 29+ messages in thread
From: Nikita Pettik @ 2020-06-22 20:21 UTC (permalink / raw)
  To: Aleksandr Lyapunov; +Cc: tarantool-patches, Vladislav Shpilevoy

On 22 Jun 17:13, Aleksandr Lyapunov wrote:
> HI, thanks for the patch! see one comment below
> 
> On 5/22/20 5:42 AM, Nikita Pettik wrote:
> > 
> > diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> > index 55a4c8fb7..b39e3b9d8 100644
> > --- a/src/box/vinyl.c
> > +++ b/src/box/vinyl.c
> > @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
> >                  if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
> >                      vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
> >                          break;
> > -               assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
> > -               vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
> > -               if (n_upserts <= VY_UPSERT_THRESHOLD)
> > -                       ++n_upserts;
> > +               /*
> > +                * Upsert was not applied in vy_history_apply(),
> > +                * but it still resides in tree memory. Ignore
> > +                * such statements and do not account them while
> > +                * counting upserts. Otherwise invalid upserts will
> > +                * get stack and won't allow other upserts to squash.
> > +                */
> > +               if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) {
> > +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt,
> > +                                             n_upserts);
> > +                       if (n_upserts <= VY_UPSERT_THRESHOLD)
> > +                               ++n_upserts;
> > +               } else {
> > +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0);
> > +               }
> I bet we should just remove assert here. If mem_stmt have
> a valid LSN now - it is a valid statement. The first of them must
> have n_upserts=0, the second - 1 etc, and n_upsert of following
> prepared statements must continue the sequence.

What is the point of keeping counting n_upserts for upserts
which can't be squashed? For sure we can simply remove assertion,
but then after accumulating more than 128 upserts which can't
be applied, vy_squash_process won't be called till dump...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied
  2020-06-22 20:21       ` Nikita Pettik
@ 2020-06-23 12:32         ` Aleksandr Lyapunov
  0 siblings, 0 replies; 29+ messages in thread
From: Aleksandr Lyapunov @ 2020-06-23 12:32 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, Vladislav Shpilevoy


>> I bet we should just remove assert here. If mem_stmt have
>> a valid LSN now - it is a valid statement. The first of them must
>> have n_upserts=0, the second - 1 etc, and n_upsert of following
>> prepared statements must continue the sequence.
> What is the point of keeping counting n_upserts for upserts
> which can't be squashed? For sure we can simply remove assertion,
> but then after accumulating more than 128 upserts which can't
> be applied, vy_squash_process won't be called till dump...
>
Why can't? After accumulating more 128 upserts squash process
must be restarted and squash those upserts that became committed.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2020-06-23 12:32 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-13 21:55 [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Nikita Pettik
2020-04-13 21:55 ` [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied Nikita Pettik
2020-06-22 19:28   ` Aleksandr Lyapunov
2020-04-13 21:55 ` [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash Nikita Pettik
2020-04-13 22:12   ` Konstantin Osipov
2020-05-14  2:11     ` Nikita Pettik
2020-05-14  6:56       ` Konstantin Osipov
2020-05-19 19:10         ` Nikita Pettik
2020-05-19 19:39           ` Konstantin Osipov
2020-05-21  2:51             ` Nikita Pettik
2020-05-21  8:36               ` Konstantin Osipov
2020-05-01  0:31   ` Vladislav Shpilevoy
2020-05-14  2:21     ` Nikita Pettik
2020-05-14 21:32       ` Vladislav Shpilevoy
2020-05-19 18:18         ` Nikita Pettik
2020-05-20 22:13           ` Vladislav Shpilevoy
2020-05-26 21:33     ` Vladislav Shpilevoy
2020-05-27 20:05       ` Nikita Pettik
2020-05-29 21:47         ` Vladislav Shpilevoy
2020-06-01 19:24           ` Nikita Pettik
2020-05-20 22:13 ` [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Vladislav Shpilevoy
2020-05-22  2:42   ` Nikita Pettik
2020-05-26 21:33     ` Vladislav Shpilevoy
2020-05-27 20:10       ` Nikita Pettik
2020-06-22 14:13     ` Aleksandr Lyapunov
2020-06-22 20:21       ` Nikita Pettik
2020-06-23 12:32         ` Aleksandr Lyapunov
2020-06-02 21:36 ` Vladislav Shpilevoy
2020-06-02 21:37   ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox