From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <vdavydov.dev@gmail.com>
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Subject: [PATCH] vinyl: fix primary index uniqueness check being skipped on insert
Date: Fri, 24 Aug 2018 22:47:58 +0300
Message-Id: <45896f24f4318084e0df2e10631da239653f64d4.1535139967.git.vdavydov.dev@gmail.com>
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
List-ID: <tarantool-patches.dev.tarantool.org>

Since we check uniqueness constraint before inserting anything into the
transaction write set, we have to deal with the situation when a
secondary index doesn't get updated. For example suppose there's tuple
{1, 1, 1} stored in a space with the primary index over the first field
and a unique secondary index over the second field. Then when processing
REPLACE {1, 1, 2}, we will find {1, 1, 1} in the secondary index, but
that doesn't mean that there's a duplicate key error - since the primary
key parts of the old and new tuples coincide, the secondary index
doesn't in fact get updated hence there's no conflict.

However, if the operation was INSERT {1, 1, 2}, then there would be a
conflict - by the primary index. Normally, we would detect such a
conflict when checking the uniqueness constraint of the primary index,
i.e. in vy_check_is_unique_primary(), but there's a case when this
doesn't happen. The point is we can optimize out the primary index
uniqueness constraint check in case the primary index key parts contain
all parts of a unique secondary index, see #3154. In such a case we must
fail vy_check_is_unique_secondary() even if the conflicting tuple has
the same primary key parts.

Fixes commit fc3834c01e ("vinyl: check key uniqueness before modifying
tx write set")

Closes #3643
---
https://github.com/tarantool/tarantool/issues/3643
https://github.com/tarantool/tarantool/commits/dv/gh-3643-vy-fix-unique-check-skip-on-insert

 src/box/vinyl.c          | 47 +++++++++++++++++++++++++++++------------------
 test/vinyl/misc.result   | 26 +++++++++++++++++++++++++-
 test/vinyl/misc.test.lua | 13 ++++++++++++-
 3 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 798a37f8..6d5ea380 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -1516,14 +1516,18 @@ vy_check_is_unique_secondary(struct vy_tx *tx, const struct vy_read_view **rv,
 	tuple_unref(key);
 	if (rc != 0)
 		return -1;
-	if (found != NULL && vy_tuple_compare(stmt, found,
-					      lsm->pk->key_def) == 0) {
-		/*
-		 * If the old and new tuples are the same in
-		 * terms of the primary key definition, the
-		 * statement doesn't modify the secondary key
-		 * and so there's actually no conflict.
-		 */
+	/*
+	 * The old and new tuples may happen to be the same in
+	 * terms of the primary key definition. For REPLACE this
+	 * means that the operation overwrites the old tuple
+	 * without modifying the secondary key and so there's
+	 * actually no conflict. For INSERT this can only happen
+	 * if we optimized out the primary index uniqueness check
+	 * (see vy_lsm::check_is_unique), in which case we must
+	 * fail here.
+	 */
+	if (found != NULL && vy_stmt_type(stmt) == IPROTO_REPLACE &&
+	    vy_tuple_compare(stmt, found, lsm->pk->key_def) == 0) {
 		tuple_unref(found);
 		return 0;
 	}
@@ -4043,6 +4047,14 @@ vy_build_insert_tuple(struct vy_env *env, struct vy_lsm *lsm,
 	if (tuple_validate(new_format, tuple) != 0)
 		return -1;
 
+	/* Reallocate the new tuple using the new space format. */
+	uint32_t data_len;
+	const char *data = tuple_data_range(tuple, &data_len);
+	struct tuple *stmt = vy_stmt_new_replace(new_format, data,
+						 data + data_len);
+	if (stmt == NULL)
+		return -1;
+
 	/*
 	 * Check unique constraint if necessary.
 	 *
@@ -4056,21 +4068,20 @@ vy_build_insert_tuple(struct vy_env *env, struct vy_lsm *lsm,
 	 * store statements with greater LSNs. So we pin the
 	 * in-memory index that is active now and insert the tuple
 	 * into it after the yield.
+	 *
+	 * Also note that this operation is semantically a REPLACE
+	 * while the original tuple may have INSERT type. Since the
+	 * uniqueness check helper is sensitive to the statement
+	 * type, we must not use the original tuple for the check.
 	 */
 	vy_mem_pin(mem);
 	rc = vy_check_is_unique_secondary(NULL, &env->xm->p_committed_read_view,
-					  space_name, index_name, lsm, tuple);
+					  space_name, index_name, lsm, stmt);
 	vy_mem_unpin(mem);
-	if (rc != 0)
-		return -1;
-
-	/* Reallocate the new tuple using the new space format. */
-	uint32_t data_len;
-	const char *data = tuple_data_range(tuple, &data_len);
-	struct tuple *stmt = vy_stmt_new_replace(new_format, data,
-						 data + data_len);
-	if (stmt == NULL)
+	if (rc != 0) {
+		tuple_unref(stmt);
 		return -1;
+	}
 
 	/* Insert the new tuple into the in-memory index. */
 	size_t mem_used_before = lsregion_used(&env->mem_env.allocator);
diff --git a/test/vinyl/misc.result b/test/vinyl/misc.result
index d2a7ca68..59492f77 100644
--- a/test/vinyl/misc.result
+++ b/test/vinyl/misc.result
@@ -113,7 +113,7 @@ s:drop()
 ---
 ...
 --
--- gh-3158: check of duplicates is skipped if the index
+-- gh-3154: check of duplicates is skipped if the index
 -- is contained by another unique index which is checked.
 --
 s = box.schema.create_space('test', {engine = 'vinyl'})
@@ -180,3 +180,27 @@ i7:stat().lookup -- 1
 s:drop()
 ---
 ...
+--
+-- gh-3643: unique optimization results in skipping uniqueness
+-- check in the primary index on insertion.
+--
+s = box.schema.space.create('test', {engine = 'vinyl'})
+---
+...
+_ = s:create_index('pk', {unique = true, parts = {1, 'unsigned', 2, 'unsigned'}})
+---
+...
+_ = s:create_index('sk', {unique = true, parts = {2, 'unsigned'}})
+---
+...
+s:insert{1, 1, 1}
+---
+- [1, 1, 1]
+...
+s:insert{1, 1, 2} -- error
+---
+- error: Duplicate key exists in unique index 'sk' in space 'test'
+...
+s:drop()
+---
+...
diff --git a/test/vinyl/misc.test.lua b/test/vinyl/misc.test.lua
index 9f61ca0a..ba7403ec 100644
--- a/test/vinyl/misc.test.lua
+++ b/test/vinyl/misc.test.lua
@@ -50,7 +50,7 @@ tx2.rollback - tx1.rollback -- 1
 s:drop()
 
 --
--- gh-3158: check of duplicates is skipped if the index
+-- gh-3154: check of duplicates is skipped if the index
 -- is contained by another unique index which is checked.
 --
 s = box.schema.create_space('test', {engine = 'vinyl'})
@@ -77,3 +77,14 @@ i6:stat().lookup -- 0
 i7:stat().lookup -- 1
 
 s:drop()
+
+--
+-- gh-3643: unique optimization results in skipping uniqueness
+-- check in the primary index on insertion.
+--
+s = box.schema.space.create('test', {engine = 'vinyl'})
+_ = s:create_index('pk', {unique = true, parts = {1, 'unsigned', 2, 'unsigned'}})
+_ = s:create_index('sk', {unique = true, parts = {2, 'unsigned'}})
+s:insert{1, 1, 1}
+s:insert{1, 1, 2} -- error
+s:drop()
-- 
2.11.0