From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [RFC PATCH 16/23] vinyl: allow to skip certain statements on read
Date: Sun, 8 Jul 2018 19:48:47 +0300
Message-ID: <d37735894882dc90433d9e43141d78890b9b085e.1531065648.git.vdavydov.dev@gmail.com>
In-Reply-To: <cover.1531065648.git.vdavydov.dev@gmail.com>

In the scope of #2129 we will defer insertion of certain DELETE
statements into secondary indexes until primary index compaction.
However, by the time we invoke compaction, new statements might have
been inserted into the space for the same set of keys. If that happens,
insertion of a deferred DELETE will break the invariant which the read
iterator relies upon: that for any key older sources store older
statements. To avoid that, let's add a new per statement flag,
VY_STMT_SKIP_READ, and make the read iterator ignore statements marked
with it.

Needed for #2129
---
 src/box/vy_mem.c  | 19 ++++++++++++-------
 src/box/vy_run.c  |  7 ++++++-
 src/box/vy_stmt.h | 10 ++++++++++
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/src/box/vy_mem.c b/src/box/vy_mem.c
index 7c9690ef..dadd73cb 100644
--- a/src/box/vy_mem.c
+++ b/src/box/vy_mem.c
@@ -323,7 +323,8 @@ vy_mem_iterator_find_lsn(struct vy_mem_iterator *itr,
 	assert(!vy_mem_tree_iterator_is_invalid(&itr->curr_pos));
 	assert(itr->curr_stmt == vy_mem_iterator_curr_stmt(itr));
 	const struct key_def *cmp_def = itr->mem->cmp_def;
-	while (vy_stmt_lsn(itr->curr_stmt) > (**itr->read_view).vlsn) {
+	while (vy_stmt_lsn(itr->curr_stmt) > (**itr->read_view).vlsn ||
+	       vy_stmt_flags(itr->curr_stmt) & VY_STMT_SKIP_READ) {
 		if (vy_mem_iterator_step(itr, iterator_type) != 0 ||
 		    (iterator_type == ITER_EQ &&
 		     vy_stmt_compare(key, itr->curr_stmt, cmp_def))) {
@@ -340,6 +341,7 @@ vy_mem_iterator_find_lsn(struct vy_mem_iterator *itr,
 			*vy_mem_tree_iterator_get_elem(&itr->mem->tree,
 						       &prev_pos);
 		if (vy_stmt_lsn(prev_stmt) > (**itr->read_view).vlsn ||
+		    vy_stmt_flags(prev_stmt) & VY_STMT_SKIP_READ ||
 		    vy_tuple_compare(itr->curr_stmt, prev_stmt,
 				     cmp_def) != 0)
 			break;
@@ -495,18 +497,21 @@ vy_mem_iterator_next_lsn(struct vy_mem_iterator *itr)
 	const struct key_def *cmp_def = itr->mem->cmp_def;
 
 	struct vy_mem_tree_iterator next_pos = itr->curr_pos;
+next:
 	vy_mem_tree_iterator_next(&itr->mem->tree, &next_pos);
 	if (vy_mem_tree_iterator_is_invalid(&next_pos))
 		return 1; /* EOF */
 
 	const struct tuple *next_stmt;
 	next_stmt = *vy_mem_tree_iterator_get_elem(&itr->mem->tree, &next_pos);
-	if (vy_tuple_compare(itr->curr_stmt, next_stmt, cmp_def) == 0) {
-		itr->curr_pos = next_pos;
-		itr->curr_stmt = next_stmt;
-		return 0;
-	}
-	return 1;
+	if (vy_tuple_compare(itr->curr_stmt, next_stmt, cmp_def) != 0)
+		return 1;
+
+	itr->curr_pos = next_pos;
+	itr->curr_stmt = next_stmt;
+	if (vy_stmt_flags(itr->curr_stmt) & VY_STMT_SKIP_READ)
+		goto next;
+	return 0;
 }
 
 /**
diff --git a/src/box/vy_run.c b/src/box/vy_run.c
index dc837c2b..6f7fb82a 100644
--- a/src/box/vy_run.c
+++ b/src/box/vy_run.c
@@ -1157,7 +1157,8 @@ vy_run_iterator_find_lsn(struct vy_run_iterator *itr,
 	assert(itr->curr_stmt != NULL);
 	assert(itr->curr_pos.page_no < slice->run->info.page_count);
 
-	while (vy_stmt_lsn(itr->curr_stmt) > (**itr->read_view).vlsn) {
+	while (vy_stmt_lsn(itr->curr_stmt) > (**itr->read_view).vlsn ||
+	       vy_stmt_flags(itr->curr_stmt) & VY_STMT_SKIP_READ) {
 		if (vy_run_iterator_next_pos(itr, iterator_type,
 					     &itr->curr_pos) != 0) {
 			vy_run_iterator_stop(itr);
@@ -1183,6 +1184,7 @@ vy_run_iterator_find_lsn(struct vy_run_iterator *itr,
 						 &test_stmt) != 0)
 			return -1;
 		if (vy_stmt_lsn(test_stmt) > (**itr->read_view).vlsn ||
+		    vy_stmt_flags(test_stmt) & VY_STMT_SKIP_READ ||
 		    vy_tuple_compare(itr->curr_stmt, test_stmt,
 				     cmp_def) != 0) {
 			tuple_unref(test_stmt);
@@ -1478,6 +1480,7 @@ vy_run_iterator_next_lsn(struct vy_run_iterator *itr, struct tuple **ret)
 	assert(itr->curr_pos.page_no < itr->slice->run->info.page_count);
 
 	struct vy_run_iterator_pos next_pos;
+next:
 	if (vy_run_iterator_next_pos(itr, ITER_GE, &next_pos) != 0) {
 		vy_run_iterator_stop(itr);
 		return 0;
@@ -1495,6 +1498,8 @@ vy_run_iterator_next_lsn(struct vy_run_iterator *itr, struct tuple **ret)
 	tuple_unref(itr->curr_stmt);
 	itr->curr_stmt = next_key;
 	itr->curr_pos = next_pos;
+	if (vy_stmt_flags(itr->curr_stmt) & VY_STMT_SKIP_READ)
+		goto next;
 
 	vy_stmt_counter_acct_tuple(&itr->stat->get, itr->curr_stmt);
 	*ret = itr->curr_stmt;
diff --git a/src/box/vy_stmt.h b/src/box/vy_stmt.h
index 8de8aa84..878a27f7 100644
--- a/src/box/vy_stmt.h
+++ b/src/box/vy_stmt.h
@@ -87,6 +87,16 @@ enum {
 	 * DELETE statements for them during compaction.
 	 */
 	VY_STMT_DEFERRED_DELETE		= 1 << 0,
+	/**
+	 * Statements that have this flag set are ignored by the
+	 * read iterator.
+	 *
+	 * We set this flag for deferred DELETE statements, because
+	 * they may violate the invariant which the read relies upon:
+	 * the older a source, the older statements it stores for a
+	 * particular key.
+	 */
+	VY_STMT_SKIP_READ		= 1 << 1,
 };
 
 /**
-- 
2.11.0
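
For readers without the vinyl sources at hand, the skip rule that the
find_lsn hunks add can be sketched on a toy model. This is only an
illustration under stated assumptions, not the code from the patch:
struct toy_stmt and toy_find_lsn() are hypothetical names, a sorted array
stands in for the in-memory tree or run, and the search is limited to a
single key. Only the flag names and values are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values mirror the patch; everything else below is hypothetical. */
enum {
	VY_STMT_DEFERRED_DELETE	= 1 << 0,
	VY_STMT_SKIP_READ	= 1 << 1,
};

/* Toy stand-in for a vinyl statement (not the real struct tuple). */
struct toy_stmt {
	int key;
	int64_t lsn;
	uint8_t flags;
};

/*
 * Statements are assumed to be sorted by key ascending and, within
 * one key, by LSN descending. Starting at @pos, return the index of
 * the newest statement for the same key that is visible from a read
 * view with the given @vlsn, or -1 if there is none.
 *
 * A statement is passed over if it is newer than the read view or
 * if it is marked VY_STMT_SKIP_READ.
 */
static int
toy_find_lsn(const struct toy_stmt *stmts, size_t count, size_t pos,
	     int64_t vlsn)
{
	assert(pos < count);
	int key = stmts[pos].key;
	for (size_t i = pos; i < count && stmts[i].key == key; i++) {
		if (stmts[i].lsn > vlsn)
			continue; /* not visible from this read view */
		if (stmts[i].flags & VY_STMT_SKIP_READ)
			continue; /* e.g. a deferred DELETE */
		return (int)i;
	}
	return -1;
}
```

With the statements {key 1, lsn 30, SKIP_READ}, {key 1, lsn 20} and
{key 1, lsn 10}, a read view at vlsn 100 gets the lsn 20 statement: the
flagged one is passed over even though its LSN is visible.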