Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Konstantin Osipov <kostja@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [RFC PATCH 04/23] vinyl: make point lookup always return the latest tuple version
Date: Wed, 11 Jul 2018 19:33:07 +0300	[thread overview]
Message-ID: <20180711163307.vxwzle6claiazwrg@esperanza> (raw)
In-Reply-To: <20180710164343.xgxdwk4ovbhdmmbo@esperanza>

BTW this patch makes the behavior of vy_point_lookup() consistent with
vy_read_iterator: both iterators now return the newest tuple version.

Also, it allows to simplify vy_squash_process() - see the patch below
(I pushed it on the branch, you may want to cherry-pick it as well):

From c001e3f1a320bb804f49ca19ec540fee094dacc3 Mon Sep 17 00:00:00 2001
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Date: Wed, 11 Jul 2018 19:14:24 +0300
Subject: [PATCH] vinyl: simplify vy_squash_process

Since vy_point_lookup() now guarantees that it returns the newest
tuple version, we can remove the code that squashes UPSERTs from
vy_squash_process().

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index a9603560..2d1a6fc0 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -3585,11 +3585,6 @@ vy_squash_process(struct vy_squash *squash)
 
 	struct vy_lsm *lsm = squash->lsm;
 	struct vy_env *env = squash->env;
-	/*
-	 * vy_apply_upsert() is used for primary key only,
-	 * so this is the same as lsm->key_def
-	 */
-	struct key_def *def = lsm->cmp_def;
 
 	/* Upserts enabled only in the primary index LSM tree. */
 	assert(lsm->index_id == 0);
@@ -3607,8 +3602,10 @@ vy_squash_process(struct vy_squash *squash)
 
 	/*
 	 * While we were reading on-disk runs, new statements could
-	 * have been inserted into the in-memory tree. Apply them to
-	 * the result.
+	 * have been prepared for the squashed key. We mustn't apply
+	 * them, because they may be rolled back, but we must adjust
+	 * their n_upserts counter so that they will get squashed by
+	 * vy_lsm_commit_upsert().
 	 */
 	struct vy_mem *mem = lsm->mem;
 	struct tree_mem_key tree_key = {
@@ -3625,108 +3622,20 @@ vy_squash_process(struct vy_squash *squash)
 		tuple_unref(result);
 		return 0;
 	}
-	/**
-	 * Algorithm of the squashing.
-	 * Assume, during building the non-UPSERT statement
-	 * 'result' in the mem some new UPSERTs were inserted, and
-	 * some of them were commited, while the other were just
-	 * prepared. And lets UPSERT_THRESHOLD to be equal to 3,
-	 * for example.
-	 *                    Mem
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = 1, n_ups = 0           |
-	 *    UPSERT, lsn = 2, n_ups = 1           | Commited
-	 *    UPSERT, lsn = 3, n_ups = 2           |
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = MAX,     n_ups = 3     |
-	 *    UPSERT, lsn = MAX + 1, n_ups = 4     | Prepared
-	 *    UPSERT, lsn = MAX + 2, n_ups = 5     |
-	 *    -------------------------------------+
-	 * In such a case the UPSERT statements with
-	 * lsns = {1, 2, 3} are squashed. But now the n_upsert
-	 * values in the prepared statements are not correct.
-	 * If we will not update values, then the
-	 * vy_lsm_commit_upsert will not be able to squash them.
-	 *
-	 * So after squashing it is necessary to update n_upsert
-	 * value in the prepared statements:
-	 *                    Mem
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = 1, n_ups = 0           |
-	 *    UPSERT, lsn = 2, n_ups = 1           | Commited
-	 *    REPLACE, lsn = 3                     |
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = MAX,     n_ups = 0 !!! |
-	 *    UPSERT, lsn = MAX + 1, n_ups = 1 !!! | Prepared
-	 *    UPSERT, lsn = MAX + 2, n_ups = 2 !!! |
-	 *    -------------------------------------+
-	 */
 	vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
-	const struct tuple *mem_stmt;
-	int64_t stmt_lsn;
-	/*
-	 * According to the described algorithm, squash the
-	 * commited UPSERTs at first.
-	 */
+	uint8_t n_upserts = 0;
 	while (!vy_mem_tree_iterator_is_invalid(&mem_itr)) {
+		const struct tuple *mem_stmt;
 		mem_stmt = *vy_mem_tree_iterator_get_elem(&mem->tree, &mem_itr);
-		stmt_lsn = vy_stmt_lsn(mem_stmt);
-		if (vy_tuple_compare(result, mem_stmt, def) != 0)
-			break;
-		/**
-		 * Leave alone prepared statements; they will be handled
-		 * in vy_range_commit_stmt.
-		 */
-		if (stmt_lsn >= MAX_LSN)
+		if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
+		    vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
 			break;
-		if (vy_stmt_type(mem_stmt) != IPROTO_UPSERT) {
-			/**
-			 * Somebody inserted non-upsert statement,
-			 * squashing is useless.
-			 */
-			tuple_unref(result);
-			return 0;
-		}
-		assert(lsm->index_id == 0);
-		struct tuple *applied = vy_apply_upsert(mem_stmt, result, def,
-							mem->format, true);
-		lsm->stat.upsert.applied++;
-		tuple_unref(result);
-		if (applied == NULL)
-			return -1;
-		result = applied;
-		/**
-		 * In normal cases we get a result with the same lsn as
-		 * in mem_stmt.
-		 * But if there are buggy upserts that do wrong things,
-		 * they are ignored and the result has lower lsn.
-		 * We should fix the lsn in any case to replace
-		 * exactly mem_stmt in general and the buggy upsert
-		 * in particular.
-		 */
-		vy_stmt_set_lsn(result, stmt_lsn);
+		assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
+		vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
+		if (n_upserts <= VY_UPSERT_THRESHOLD)
+			++n_upserts;
 		vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
 	}
-	/*
-	 * The second step of the algorithm above is updating of
-	 * n_upsert values of the prepared UPSERTs.
-	 */
-	if (stmt_lsn >= MAX_LSN) {
-		uint8_t n_upserts = 0;
-		while (!vy_mem_tree_iterator_is_invalid(&mem_itr)) {
-			mem_stmt = *vy_mem_tree_iterator_get_elem(&mem->tree,
-								  &mem_itr);
-			if (vy_tuple_compare(result, mem_stmt, def) != 0 ||
-			    vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
-				break;
-			assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
-			vy_stmt_set_n_upserts((struct tuple *)mem_stmt,
-					      n_upserts);
-			if (n_upserts <= VY_UPSERT_THRESHOLD)
-				++n_upserts;
-			vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
-		}
-	}
 
 	lsm->stat.upsert.squashed++;
 

  reply	other threads:[~2018-07-11 16:33 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-08 16:48 [RFC PATCH 02/23] vinyl: always get full tuple from pk after reading from secondary index Vladimir Davydov
2018-07-08 16:48 ` [RFC PATCH 00/23] vinyl: eliminate read on REPLACE/DELETE Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 01/23] vinyl: do not turn REPLACE into INSERT when processing DML request Vladimir Davydov
2018-07-10 12:15     ` Konstantin Osipov
2018-07-10 12:19       ` Vladimir Davydov
2018-07-10 18:39         ` Konstantin Osipov
2018-07-11  7:57           ` Vladimir Davydov
2018-07-11 10:25             ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 03/23] vinyl: use vy_mem_iterator for point lookup Vladimir Davydov
2018-07-17 10:14     ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 04/23] vinyl: make point lookup always return the latest tuple version Vladimir Davydov
2018-07-10 16:19     ` Konstantin Osipov
2018-07-10 16:43       ` Vladimir Davydov
2018-07-11 16:33         ` Vladimir Davydov [this message]
2018-07-31 19:17           ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 05/23] vinyl: fold vy_replace_one and vy_replace_impl Vladimir Davydov
2018-07-31 20:28     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 06/23] vinyl: fold vy_delete_impl Vladimir Davydov
2018-07-31 20:28     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 07/23] vinyl: refactor unique check Vladimir Davydov
2018-07-31 20:28     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 08/23] vinyl: check key uniqueness before modifying tx write set Vladimir Davydov
2018-07-31 20:34     ` Konstantin Osipov
2018-08-01 10:42       ` Vladimir Davydov
2018-08-09 20:26     ` Konstantin Osipov
2018-08-10  8:26       ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 09/23] vinyl: remove env argument of vy_check_is_unique_{primary,secondary} Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 10/23] vinyl: store full tuples in secondary index cache Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 11/23] xrow: allow to store flags in DML requests Vladimir Davydov
2018-07-31 20:36     ` Konstantin Osipov
2018-08-01 14:10       ` Vladimir Davydov
2018-08-17 13:34         ` Vladimir Davydov
2018-08-17 13:34           ` [PATCH 1/2] xrow: allow to store tuple metadata in request Vladimir Davydov
2018-08-17 13:34           ` [PATCH 2/2] vinyl: introduce statement flags Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 12/23] vinyl: do not pass region explicitly to write iterator functions Vladimir Davydov
2018-07-17 10:16     ` Vladimir Davydov
2018-07-31 20:38     ` Konstantin Osipov
2018-08-01 14:14       ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 13/23] vinyl: fix potential use-after-free in vy_read_view_merge Vladimir Davydov
2018-07-17 10:16     ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 14/23] test: unit/vy_write_iterator: minor refactoring Vladimir Davydov
2018-07-17 10:17     ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 15/23] vinyl: teach write iterator to return overwritten tuples Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 16/23] vinyl: allow to skip certain statements on read Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 17/23] vinyl: do not free pending tasks on shutdown Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 18/23] vinyl: store pointer to scheduler in struct vy_task Vladimir Davydov
2018-07-31 20:39     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 19/23] vinyl: rename some members of vy_scheduler and vy_task struct Vladimir Davydov
2018-07-31 20:40     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 20/23] vinyl: use cbus for communication between scheduler and worker threads Vladimir Davydov
2018-07-31 20:43     ` Konstantin Osipov
2018-08-01 14:26       ` Vladimir Davydov
2018-07-08 16:48   ` [RFC PATCH 21/23] vinyl: zap vy_scheduler::is_worker_pool_running Vladimir Davydov
2018-07-31 20:43     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 22/23] vinyl: rename vy_task::status to is_failed Vladimir Davydov
2018-07-31 20:44     ` Konstantin Osipov
2018-07-08 16:48   ` [RFC PATCH 23/23] vinyl: eliminate read on REPLACE/DELETE Vladimir Davydov
2018-07-13 10:53     ` Vladimir Davydov
2018-07-13 10:53       ` [PATCH 1/3] stailq: add stailq_insert function Vladimir Davydov
2018-07-15  7:02         ` Konstantin Osipov
2018-07-15 13:17           ` Vladimir Davydov
2018-07-15 18:40             ` Konstantin Osipov
2018-07-17 10:18         ` Vladimir Davydov
2018-07-13 10:53       ` [PATCH 2/3] vinyl: link all indexes of the same space Vladimir Davydov
2018-07-13 10:53       ` [PATCH 3/3] vinyl: generate deferred DELETEs on tx commit Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180711163307.vxwzle6claiazwrg@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [RFC PATCH 04/23] vinyl: make point lookup always return the latest tuple version' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox