Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [PATCH 07/25] vinyl: check key uniqueness before modifying tx write set
Date: Fri, 27 Jul 2018 14:29:47 +0300	[thread overview]
Message-ID: <3b05419004e495d79e748dd2daa18ac5fd4464b3.1532689066.git.vdavydov.dev@gmail.com> (raw)
In-Reply-To: <cover.1532689065.git.vdavydov.dev@gmail.com>
In-Reply-To: <cover.1532689065.git.vdavydov.dev@gmail.com>

Currently, we handle INSERT/REPLACE/UPDATE requests by iterating over
all space indexes starting from the primary and inserting the
corresponding statements to tx write set, checking key uniqueness if
necessary. This means that by the time we write a REPLACE to the write
set of a secondary index, it has already been written to the primary
index write set. This is OK, and vy_tx_prepare() relies on that to
implement the common memory level. However, this also means that when we
check uniqueness of a secondary index, the new REPLACE can be found via
the primary index. This is OK now, because all indexes are fully
independent, but it isn't going to fly after #2129 is implemented. The
problem is in order to check if a tuple is present in a secondary index,
we will have to look up the corresponding full tuple in the primary
index. To illustrate the problem, consider the following situation:

  Primary index covers field 1.
  Secondary index covers field 2.

  Committed statements:

    REPLACE{1, 10, lsn=1} - present in both indexes
    DELETE{1, lsn=2} - present only in the primary index

  Transaction:

    REPLACE{1, 10}

When we check uniqueness of the secondary index, we find committed
statement REPLACE{1, 10, lsn=1}, then look up the corresponding full
tuple in the primary index and find REPLACE{1, 10}. Since the two tuples
match, we mistakenly assume that there's a conflict.

To avoid a situation like that, let's check uniqueness before modifying
the write set of any index.

Needed for #2129
---
 src/box/vinyl.c | 128 +++++++++++++++++++++++++++-----------------------------
 1 file changed, 62 insertions(+), 66 deletions(-)

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 5da1c4bc..1d086439 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -1472,6 +1472,17 @@ vy_check_is_unique_secondary(struct vy_env *env, struct vy_tx *tx,
 	tuple_unref(key);
 	if (rc != 0)
 		return -1;
+	if (found != NULL && vy_tuple_compare(stmt, found,
+					      lsm->pk->key_def) == 0) {
+		/*
+		 * If the old and new tuples are the same in
+		 * terms of the primary key definition, the
+		 * statement doesn't modify the secondary key
+		 * and so there's actually no conflict.
+		 */
+		tuple_unref(found);
+		return 0;
+	}
 	if (found != NULL) {
 		tuple_unref(found);
 		diag_set(ClientError, ER_TUPLE_FOUND,
@@ -1482,68 +1493,51 @@ vy_check_is_unique_secondary(struct vy_env *env, struct vy_tx *tx,
 }
 
 /**
- * Insert a tuple in a primary index LSM tree.
- * @param env   Vinyl environment.
- * @param tx    Current transaction.
- * @param space Target space.
- * @param pk    Primary index LSM tree.
- * @param stmt  Tuple to insert.
- *
- * @retval  0 Success.
- * @retval -1 Memory error or duplicate key error.
- */
-static inline int
-vy_insert_primary(struct vy_env *env, struct vy_tx *tx, struct space *space,
-		  struct vy_lsm *pk, struct tuple *stmt)
-{
-	assert(vy_stmt_type(stmt) == IPROTO_INSERT);
-	assert(tx != NULL && tx->state == VINYL_TX_READY);
-	assert(pk->index_id == 0);
-	/*
-	 * A primary index is always unique and the new tuple must not
-	 * conflict with existing tuples.
-	 */
-	if (vy_check_is_unique_primary(env, tx, vy_tx_read_view(tx),
-				       space_name(space),
-				       index_name_by_id(space, pk->index_id),
-				       pk, stmt) != 0)
-		return -1;
-	return vy_tx_set(tx, pk, stmt);
-}
-
-/**
- * Insert a tuple in a secondary index LSM tree.
- * @param env       Vinyl environment.
- * @param tx        Current transaction.
- * @param space     Target space.
- * @param lsm       Secondary index LSM tree.
- * @param stmt      Tuple to replace.
+ * Check if insertion of a new tuple violates unique constraint
+ * of any index of the space.
+ * @param env        Vinyl environment.
+ * @param tx         Current transaction.
+ * @param space      Space to check.
+ * @param stmt       New tuple.
  *
- * @retval  0 Success.
- * @retval -1 Memory error or duplicate key error.
+ * @retval  0 Success, unique constraint is satisfied.
+ * @retval -1 Duplicate is found or read error occurred.
  */
 static int
-vy_insert_secondary(struct vy_env *env, struct vy_tx *tx, struct space *space,
-		    struct vy_lsm *lsm, struct tuple *stmt)
+vy_check_is_unique(struct vy_env *env, struct vy_tx *tx,
+		   struct space *space, struct tuple *stmt)
 {
+	assert(space->index_count > 0);
 	assert(vy_stmt_type(stmt) == IPROTO_INSERT ||
 	       vy_stmt_type(stmt) == IPROTO_REPLACE);
-	assert(tx != NULL && tx->state == VINYL_TX_READY);
-	assert(lsm->index_id > 0);
 
-	if (vy_check_is_unique_secondary(env, tx, vy_tx_read_view(tx),
-					 space_name(space),
-					 index_name_by_id(space, lsm->index_id),
-					 lsm, stmt) != 0)
-		return -1;
+	const struct vy_read_view **rv = vy_tx_read_view(tx);
 
 	/*
-	 * We must always append the statement to transaction write set
-	 * of each LSM tree, even if operation itself does not update
-	 * the LSM tree, e.g. it's an UPDATE, to ensure we read our
-	 * own writes.
+	 * We only need to check the uniqueness of the primary index
+	 * if this is INSERT, because REPLACE will silently overwrite
+	 * the existing tuple, if any.
 	 */
-	return vy_tx_set(tx, lsm, stmt);
+	if (vy_stmt_type(stmt) == IPROTO_INSERT) {
+		struct vy_lsm *lsm = vy_lsm(space->index[0]);
+		if (vy_check_is_unique_primary(env, tx, rv, space_name(space),
+					       index_name_by_id(space, 0),
+					       lsm, stmt) != 0)
+			return -1;
+	}
+
+	/*
+	 * For secondary indexes, uniqueness must be checked on both
+	 * INSERT and REPLACE.
+	 */
+	for (uint32_t i = 1; i < space->index_count; i++) {
+		struct vy_lsm *lsm = vy_lsm(space->index[i]);
+		if (vy_check_is_unique_secondary(env, tx, rv, space_name(space),
+						 index_name_by_id(space, i),
+						 lsm, stmt) != 0)
+			return -1;
+	}
+	return 0;
 }
 
 /**
@@ -1764,6 +1758,8 @@ vy_update(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 	if (vy_check_update(space, pk, stmt->old_tuple, stmt->new_tuple,
 			    column_mask) != 0)
 		return -1;
+	if (vy_check_is_unique(env, tx, space, stmt->new_tuple) != 0)
+		return -1;
 
 	/*
 	 * In the primary index the tuple can be replaced without
@@ -1786,7 +1782,7 @@ vy_update(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 			continue;
 		if (vy_tx_set(tx, lsm, delete) != 0)
 			goto error;
-		if (vy_insert_secondary(env, tx, space, lsm, stmt->new_tuple))
+		if (vy_tx_set(tx, lsm, stmt->new_tuple) != 0)
 			goto error;
 	}
 	tuple_unref(delete);
@@ -1814,13 +1810,15 @@ vy_insert_first_upsert(struct vy_env *env, struct vy_tx *tx,
 	assert(tx != NULL && tx->state == VINYL_TX_READY);
 	assert(space->index_count > 0);
 	assert(vy_stmt_type(stmt) == IPROTO_INSERT);
+	if (vy_check_is_unique(env, tx, space, stmt) != 0)
+		return -1;
 	struct vy_lsm *pk = vy_lsm(space->index[0]);
 	assert(pk->index_id == 0);
 	if (vy_tx_set(tx, pk, stmt) != 0)
 		return -1;
 	for (uint32_t i = 1; i < space->index_count; ++i) {
 		struct vy_lsm *lsm = vy_lsm(space->index[i]);
-		if (vy_insert_secondary(env, tx, space, lsm, stmt) != 0)
+		if (vy_tx_set(tx, lsm, stmt) != 0)
 			return -1;
 	}
 	return 0;
@@ -2045,6 +2043,8 @@ vy_upsert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 		 */
 		return 0;
 	}
+	if (vy_check_is_unique(env, tx, space, stmt->new_tuple) != 0)
+		return -1;
 	if (vy_tx_set(tx, pk, stmt->new_tuple))
 		return -1;
 	if (space->index_count == 1)
@@ -2063,8 +2063,7 @@ vy_upsert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 			continue;
 		if (vy_tx_set(tx, lsm, delete) != 0)
 			goto error;
-		if (vy_insert_secondary(env, tx, space, lsm,
-					stmt->new_tuple) != 0)
+		if (vy_tx_set(tx, lsm, stmt->new_tuple) != 0)
 			goto error;
 	}
 	tuple_unref(delete);
@@ -2107,15 +2106,16 @@ vy_insert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 					     request->tuple_end);
 	if (stmt->new_tuple == NULL)
 		return -1;
-	if (vy_insert_primary(env, tx, space, pk, stmt->new_tuple) != 0)
+	if (vy_check_is_unique(env, tx, space, stmt->new_tuple) != 0)
+		return -1;
+	if (vy_tx_set(tx, pk, stmt->new_tuple) != 0)
 		return -1;
 
 	for (uint32_t iid = 1; iid < space->index_count; ++iid) {
 		struct vy_lsm *lsm = vy_lsm(space->index[iid]);
 		if (vy_is_committed_one(env, lsm))
 			continue;
-		if (vy_insert_secondary(env, tx, space, lsm,
-					stmt->new_tuple) != 0)
+		if (vy_tx_set(tx, lsm, stmt->new_tuple) != 0)
 			return -1;
 	}
 	return 0;
@@ -2158,6 +2158,8 @@ vy_replace(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 					      request->tuple_end);
 	if (stmt->new_tuple == NULL)
 		return -1;
+	if (vy_check_is_unique(env, tx, space, stmt->new_tuple) != 0)
+		return -1;
 	/*
 	 * Get the overwritten tuple from the primary index if
 	 * the space has on_replace triggers, in which case we
@@ -2201,18 +2203,12 @@ vy_replace(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
 		struct vy_lsm *lsm = vy_lsm(space->index[i]);
 		if (vy_is_committed_one(env, lsm))
 			continue;
-		/*
-		 * DELETE goes first, so if old and new keys
-		 * fully match, there is no look up beyond the
-		 * transaction write set.
-		 */
 		if (delete != NULL) {
 			rc = vy_tx_set(tx, lsm, delete);
 			if (rc != 0)
 				break;
 		}
-		rc = vy_insert_secondary(env, tx, space, lsm,
-					 stmt->new_tuple);
+		rc = vy_tx_set(tx, lsm, stmt->new_tuple);
 		if (rc != 0)
 			break;
 	}
-- 
2.11.0

  parent reply	other threads:[~2018-07-27 11:29 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-27 11:29 [PATCH 00/25] vinyl: eliminate disk read on REPLACE/DELETE Vladimir Davydov
2018-07-27 11:29 ` [PATCH 01/25] vinyl: make point lookup always return the latest tuple version Vladimir Davydov
2018-07-27 11:29 ` [PATCH 02/25] vinyl: simplify vy_squash_process Vladimir Davydov
2018-07-27 11:29 ` [PATCH 03/25] vinyl: always get full tuple from pk after reading from secondary index Vladimir Davydov
2018-07-27 11:29 ` [PATCH 04/25] vinyl: fold vy_replace_one and vy_replace_impl Vladimir Davydov
2018-07-27 11:29 ` [PATCH 05/25] vinyl: fold vy_delete_impl Vladimir Davydov
2018-07-27 11:29 ` [PATCH 06/25] vinyl: refactor unique check Vladimir Davydov
2018-07-27 11:29 ` Vladimir Davydov [this message]
2018-07-27 11:29 ` [PATCH 08/25] vinyl: remove env argument of vy_check_is_unique_{primary,secondary} Vladimir Davydov
2018-07-31 20:45   ` [tarantool-patches] " Konstantin Osipov
2018-07-27 11:29 ` [PATCH 09/25] vinyl: store full tuples in secondary index cache Vladimir Davydov
2018-07-31 20:47   ` Konstantin Osipov
2018-07-27 11:29 ` [PATCH 10/25] vinyl: do not free pending tasks on shutdown Vladimir Davydov
2018-07-31 20:48   ` Konstantin Osipov
2018-07-27 11:29 ` [PATCH 11/25] vinyl: store pointer to scheduler in struct vy_task Vladimir Davydov
2018-07-31 20:49   ` Konstantin Osipov
2018-07-27 11:29 ` [PATCH 12/25] vinyl: rename some members of vy_scheduler and vy_task struct Vladimir Davydov
2018-07-27 11:29 ` [PATCH 13/25] vinyl: use cbus for communication between scheduler and worker threads Vladimir Davydov
2018-07-27 11:29 ` [PATCH 14/25] vinyl: zap vy_scheduler::is_worker_pool_running Vladimir Davydov
2018-07-27 11:29 ` [PATCH 15/25] vinyl: rename vy_task::status to is_failed Vladimir Davydov
2018-07-27 11:29 ` [PATCH 16/25] xrow: allow to store flags in DML requests Vladimir Davydov
2018-07-27 11:29 ` [PATCH 17/25] vinyl: pin last statement returned by write iterator explicitly Vladimir Davydov
2018-07-27 11:29 ` [PATCH 18/25] vinyl: teach write iterator to return overwritten tuples Vladimir Davydov
2018-07-27 11:29 ` [PATCH 19/25] vinyl: prepare write iterator heap comparator for deferred DELETEs Vladimir Davydov
2018-07-27 11:30 ` [PATCH 20/25] vinyl: allow to skip certain statements on read Vladimir Davydov
2018-07-27 11:30 ` [PATCH 21/25] vinyl: add function to create surrogate deletes from raw msgpack Vladimir Davydov
2018-07-27 11:30 ` [PATCH 22/25] vinyl: remove pointless assertion from vy_stmt_new_surrogate_delete Vladimir Davydov
2018-07-27 11:30 ` [PATCH 23/25] txn: add helper to detect transaction boundaries Vladimir Davydov
2018-07-31 20:52   ` [tarantool-patches] " Konstantin Osipov
2018-07-27 11:30 ` [PATCH 24/25] Introduce _vinyl_deferred_delete system space Vladimir Davydov
2018-07-31 20:54   ` Konstantin Osipov
2018-08-01 14:00     ` Vladimir Davydov
2018-08-01 20:25       ` [tarantool-patches] " Konstantin Osipov
2018-08-02  9:43         ` Vladimir Davydov
2018-08-06  8:42           ` Vladimir Davydov
2018-07-27 11:30 ` [PATCH 25/25] vinyl: eliminate disk read on REPLACE/DELETE Vladimir Davydov
2018-07-31 20:55   ` Konstantin Osipov
2018-08-01 16:03     ` Vladimir Davydov
2018-08-01 16:51     ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b05419004e495d79e748dd2daa18ac5fd4464b3.1532689066.git.vdavydov.dev@gmail.com \
    --to=vdavydov.dev@gmail.com \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [PATCH 07/25] vinyl: check key uniqueness before modifying tx write set' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox