From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Vladimir Davydov Subject: [PATCH 6/6] vinyl: simplify read iterator restoration behavior Date: Tue, 26 Mar 2019 18:50:34 +0300 Message-Id: <208f29dcc3ca58f2acd4601195402c7dadbbb1fe.1553613748.git.vdavydov.dev@gmail.com> In-Reply-To: References: In-Reply-To: References: To: tarantool-patches@freelists.org List-ID: The 'restore' method, which is implemented by txw, cache, and memory sources, restores an iterator position after a yield in case the source has changed. It is passed the last key and it is supposed to position the iterator to the statement following the last key provided it had to reposition the iterator at all. For all kinds of sources, it first checks the source version and bails out early if it hasn't changed since the last time the iterator was used. If it has changed, it either looks up the statement following the given last key or tries to advance the iterator statement by statement (in case of a cache iterator, to avoid extra cache lookups). Currently, the method returns 1 if the iterator position has actually changed as a result of the call, 0 otherwise. That is, it returns 0 even if the version has changed and it had to perform a full lookup, but landed on the same statement. This is confusing, because in this case the caller has to check if it has to advance the iterator even though it doesn't need to, as the iterator is already positioned at the next key - see vy_read_iterator_scan_* family of functions. Let's simplify the restoration rules - make the 'restore' method return 1 if it had to restore the iterator position irrespective of whether the position has actually changed or not. This means that in case the method returns 1, the caller knows that the iterator is already positioned properly and doesn't need to be advanced. If it returns 0, it means that the iterator is positioned at the same statement as before and we need to check if we need to advance it. This simplifies the iterator code by removing position checking, which would turn real nasty once multikey indexes are introduced. Note, comments to 'restore' methods already conform to this behavior (they weren't quite correct before this patch). There's a catch though - vy_read_iterator_restore_mem relies on the old behavior to skip the cache source in case the current key gets updated during yield. However, it's easy to fix that without introducing any extra overhead - we just need to check if the cache source is at the front of the iterator and clear its history if it is. BTW it turned out that this behavior wasn't covered by tests - when I removed the line invalidating the cache source from vy_read_iterator_restore_mem, all tests passed as usual. So while we are at it, let's also add a test case covering this piece of code. --- src/box/vy_cache.c | 17 ++++++++++------- src/box/vy_mem.c | 3 --- src/box/vy_read_iterator.c | 4 +++- src/box/vy_tx.c | 3 --- test/vinyl/stat.result | 4 ++-- test/vinyl/upsert.result | 44 ++++++++++++++++++++++++++++++++++++++++++++ test/vinyl/upsert.test.lua | 18 ++++++++++++++++++ 7 files changed, 77 insertions(+), 16 deletions(-) diff --git a/src/box/vy_cache.c b/src/box/vy_cache.c index 825bfd3b..e526ab1e 100644 --- a/src/box/vy_cache.c +++ b/src/box/vy_cache.c @@ -766,11 +766,11 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, if (!itr->search_started || itr->version == itr->cache->version) return 0; + bool pos_changed = false; itr->version = itr->cache->version; - struct tuple *prev_stmt = itr->curr_stmt; - if ((prev_stmt == NULL && itr->iterator_type == ITER_EQ) || - (prev_stmt != NULL && - prev_stmt != vy_cache_iterator_curr_stmt(itr))) { + if ((itr->curr_stmt == NULL && itr->iterator_type == ITER_EQ) || + (itr->curr_stmt != NULL && + itr->curr_stmt != vy_cache_iterator_curr_stmt(itr))) { /* * EQ search ended or the iterator was invalidated. * In either case the best we can do is restart the @@ -778,6 +778,7 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, */ *stop = vy_cache_iterator_seek(itr, last_stmt); vy_cache_iterator_skip_to_read_view(itr, stop); + pos_changed = true; } else { /* * The iterator position is still valid, but new @@ -797,7 +798,7 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, struct key_def *def = itr->cache->cmp_def; struct vy_cache_tree *tree = &itr->cache->cache_tree; struct vy_cache_tree_iterator pos = itr->curr_pos; - if (prev_stmt == NULL) + if (itr->curr_stmt == NULL) pos = vy_cache_tree_invalid_iterator(); while (true) { if (dir > 0) @@ -818,11 +819,14 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, itr->curr_stmt = entry->stmt; tuple_ref(itr->curr_stmt); *stop = vy_cache_iterator_is_stop(itr, entry); + pos_changed = true; } if (cmp == 0) break; } } + if (!pos_changed) + return 0; vy_history_cleanup(history); if (itr->curr_stmt != NULL) { @@ -830,9 +834,8 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, itr->curr_stmt); if (vy_history_append_stmt(history, itr->curr_stmt) != 0) return -1; - return prev_stmt != itr->curr_stmt; } - return 0; + return 1; } void diff --git a/src/box/vy_mem.c b/src/box/vy_mem.c index 7fa90cbd..e3b40083 100644 --- a/src/box/vy_mem.c +++ b/src/box/vy_mem.c @@ -595,10 +595,7 @@ vy_mem_iterator_restore(struct vy_mem_iterator *itr, if (!itr->search_started || itr->version == itr->mem->version) return 0; - const struct tuple *prev_stmt = itr->curr_stmt; vy_mem_iterator_seek(itr, last_stmt); - if (prev_stmt == itr->curr_stmt) - return 0; vy_history_cleanup(history); if (itr->curr_stmt != NULL && diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c index 2864a280..ac8a410d 100644 --- a/src/box/vy_read_iterator.c +++ b/src/box/vy_read_iterator.c @@ -424,7 +424,9 @@ vy_read_iterator_restore_mem(struct vy_read_iterator *itr, * Make sure we don't read the old value * from the cache while applying UPSERTs. */ - itr->src[itr->cache_src].front_id = 0; + struct vy_read_src *cache_src = &itr->src[itr->cache_src]; + if (cache_src->front_id == itr->front_id) + vy_history_cleanup(&cache_src->history); } src->front_id = itr->front_id; return 0; diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c index 02f5ca27..bbb41df5 100644 --- a/src/box/vy_tx.c +++ b/src/box/vy_tx.c @@ -1247,10 +1247,7 @@ vy_txw_iterator_restore(struct vy_txw_iterator *itr, if (!itr->search_started || itr->version == itr->tx->write_set_version) return 0; - struct txv *prev_txv = itr->curr_txv; vy_txw_iterator_seek(itr, last_stmt); - if (prev_txv == itr->curr_txv) - return 0; vy_history_cleanup(history); if (itr->curr_txv != NULL) { diff --git a/test/vinyl/stat.result b/test/vinyl/stat.result index 08da1658..ff73d42a 100644 --- a/test/vinyl/stat.result +++ b/test/vinyl/stat.result @@ -716,8 +716,8 @@ stat_diff(istat(), st) rows: 5 bytes: 5305 get: - rows: 9 - bytes: 9549 + rows: 5 + bytes: 5305 txw: iterator: lookup: 1 diff --git a/test/vinyl/upsert.result b/test/vinyl/upsert.result index 6c6d03d6..c807b5b5 100644 --- a/test/vinyl/upsert.result +++ b/test/vinyl/upsert.result @@ -791,3 +791,47 @@ s:select({100}, 'GE') s:drop() --- ... +-- +-- Check that read iterator ignores a cached statement in case +-- the current key is updated while it yields on disk read. +-- +fiber = require('fiber') +--- +... +s = box.schema.space.create('test', {engine = 'vinyl'}) +--- +... +_ = s:create_index('pk') +--- +... +s:replace{10, 10} +--- +- [10, 10] +... +box.snapshot() +--- +- ok +... +s:get(10) -- add [10, 10] to the cache +--- +- [10, 10] +... +ch = fiber.channel(1) +--- +... +test_run:cmd("setopt delimiter ';'") +--- +- true +... +_ = fiber.create(function() ch:put(s:select()) end) +s:upsert({10, 10}, {{'+', 2, 10}}) +test_run:cmd("setopt delimiter ''"); +--- +... +ch:get() -- should see the UPSERT and return [10, 20] +--- +- - [10, 20] +... +s:drop() +--- +... diff --git a/test/vinyl/upsert.test.lua b/test/vinyl/upsert.test.lua index ef83e0f6..cf287360 100644 --- a/test/vinyl/upsert.test.lua +++ b/test/vinyl/upsert.test.lua @@ -325,3 +325,21 @@ s:upsert({100}, {{'+', 2, 100}}) s:select({100}, 'GE') s:drop() + +-- +-- Check that read iterator ignores a cached statement in case +-- the current key is updated while it yields on disk read. +-- +fiber = require('fiber') +s = box.schema.space.create('test', {engine = 'vinyl'}) +_ = s:create_index('pk') +s:replace{10, 10} +box.snapshot() +s:get(10) -- add [10, 10] to the cache +ch = fiber.channel(1) +test_run:cmd("setopt delimiter ';'") +_ = fiber.create(function() ch:put(s:select()) end) +s:upsert({10, 10}, {{'+', 2, 10}}) +test_run:cmd("setopt delimiter ''"); +ch:get() -- should see the UPSERT and return [10, 20] +s:drop() -- 2.11.0