From: Vladimir Davydov <vdavydov.dev@gmail.com> To: tarantool-patches@freelists.org Subject: [PATCH 6/6] vinyl: simplify read iterator restoration behavior Date: Tue, 26 Mar 2019 18:50:34 +0300 [thread overview] Message-ID: <208f29dcc3ca58f2acd4601195402c7dadbbb1fe.1553613748.git.vdavydov.dev@gmail.com> (raw) In-Reply-To: <cover.1553613748.git.vdavydov.dev@gmail.com> In-Reply-To: <cover.1553613748.git.vdavydov.dev@gmail.com> The 'restore' method, which is implemented by txw, cache, and memory sources, restores an iterator position after a yield in case the source has changed. It is passed the last key and it is supposed to position the iterator to the statement following the last key provided it had to reposition the iterator at all. For all kinds of sources, it first checks the source version and bails out early if it hasn't changed since the last time the iterator was used. If it has changed, it either looks up the statement following the given last key or tries to advance the iterator statement by statement (in case of a cache iterator, to avoid extra cache lookups). Currently, the method returns 1 if the iterator position has actually changed as a result of the call, 0 otherwise. That is, it returns 0 even if the version has changed and it had to perform a full lookup, but landed on the same statement. This is confusing, because in this case the caller has to check if it has to advance the iterator even though it doesn't need to, as the iterator is already positioned at the next key - see vy_read_iterator_scan_* family of functions. Let's simplify the restoration rules - make the 'restore' method return 1 if it had to restore the iterator position irrespective of whether the position has actually changed or not. This means that in case the method returns 1, the caller knows that the iterator is already positioned properly and doesn't need to be advanced. If it returns 0, it means that the iterator is positioned at the same statement as before and we need to check if we need to advance it. This simplifies the iterator code by removing position checking, which would turn real nasty once multikey indexes are introduced. Note, comments to 'restore' methods already conform to this behavior (they weren't quite correct before this patch). There's a catch though - vy_read_iterator_restore_mem relies on the old behavior to skip the cache source in case the current key gets updated during yield. However, it's easy to fix that without introducing any extra overhead - we just need to check if the cache source is at the front of the iterator and clear its history if it is. BTW it turned out that this behavior wasn't covered by tests - when I removed the line invalidating the cache source from vy_read_iterator_restore_mem, all tests passed as usual. So while we are at it, let's also add a test case covering this piece of code. --- src/box/vy_cache.c | 17 ++++++++++------- src/box/vy_mem.c | 3 --- src/box/vy_read_iterator.c | 4 +++- src/box/vy_tx.c | 3 --- test/vinyl/stat.result | 4 ++-- test/vinyl/upsert.result | 44 ++++++++++++++++++++++++++++++++++++++++++++ test/vinyl/upsert.test.lua | 18 ++++++++++++++++++ 7 files changed, 77 insertions(+), 16 deletions(-) diff --git a/src/box/vy_cache.c b/src/box/vy_cache.c index 825bfd3b..e526ab1e 100644 --- a/src/box/vy_cache.c +++ b/src/box/vy_cache.c @@ -766,11 +766,11 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, if (!itr->search_started || itr->version == itr->cache->version) return 0; + bool pos_changed = false; itr->version = itr->cache->version; - struct tuple *prev_stmt = itr->curr_stmt; - if ((prev_stmt == NULL && itr->iterator_type == ITER_EQ) || - (prev_stmt != NULL && - prev_stmt != vy_cache_iterator_curr_stmt(itr))) { + if ((itr->curr_stmt == NULL && itr->iterator_type == ITER_EQ) || + (itr->curr_stmt != NULL && + itr->curr_stmt != vy_cache_iterator_curr_stmt(itr))) { /* * EQ search ended or the iterator was invalidated. * In either case the best we can do is restart the @@ -778,6 +778,7 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, */ *stop = vy_cache_iterator_seek(itr, last_stmt); vy_cache_iterator_skip_to_read_view(itr, stop); + pos_changed = true; } else { /* * The iterator position is still valid, but new @@ -797,7 +798,7 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, struct key_def *def = itr->cache->cmp_def; struct vy_cache_tree *tree = &itr->cache->cache_tree; struct vy_cache_tree_iterator pos = itr->curr_pos; - if (prev_stmt == NULL) + if (itr->curr_stmt == NULL) pos = vy_cache_tree_invalid_iterator(); while (true) { if (dir > 0) @@ -818,11 +819,14 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, itr->curr_stmt = entry->stmt; tuple_ref(itr->curr_stmt); *stop = vy_cache_iterator_is_stop(itr, entry); + pos_changed = true; } if (cmp == 0) break; } } + if (!pos_changed) + return 0; vy_history_cleanup(history); if (itr->curr_stmt != NULL) { @@ -830,9 +834,8 @@ vy_cache_iterator_restore(struct vy_cache_iterator *itr, itr->curr_stmt); if (vy_history_append_stmt(history, itr->curr_stmt) != 0) return -1; - return prev_stmt != itr->curr_stmt; } - return 0; + return 1; } void diff --git a/src/box/vy_mem.c b/src/box/vy_mem.c index 7fa90cbd..e3b40083 100644 --- a/src/box/vy_mem.c +++ b/src/box/vy_mem.c @@ -595,10 +595,7 @@ vy_mem_iterator_restore(struct vy_mem_iterator *itr, if (!itr->search_started || itr->version == itr->mem->version) return 0; - const struct tuple *prev_stmt = itr->curr_stmt; vy_mem_iterator_seek(itr, last_stmt); - if (prev_stmt == itr->curr_stmt) - return 0; vy_history_cleanup(history); if (itr->curr_stmt != NULL && diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c index 2864a280..ac8a410d 100644 --- a/src/box/vy_read_iterator.c +++ b/src/box/vy_read_iterator.c @@ -424,7 +424,9 @@ vy_read_iterator_restore_mem(struct vy_read_iterator *itr, * Make sure we don't read the old value * from the cache while applying UPSERTs. */ - itr->src[itr->cache_src].front_id = 0; + struct vy_read_src *cache_src = &itr->src[itr->cache_src]; + if (cache_src->front_id == itr->front_id) + vy_history_cleanup(&cache_src->history); } src->front_id = itr->front_id; return 0; diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c index 02f5ca27..bbb41df5 100644 --- a/src/box/vy_tx.c +++ b/src/box/vy_tx.c @@ -1247,10 +1247,7 @@ vy_txw_iterator_restore(struct vy_txw_iterator *itr, if (!itr->search_started || itr->version == itr->tx->write_set_version) return 0; - struct txv *prev_txv = itr->curr_txv; vy_txw_iterator_seek(itr, last_stmt); - if (prev_txv == itr->curr_txv) - return 0; vy_history_cleanup(history); if (itr->curr_txv != NULL) { diff --git a/test/vinyl/stat.result b/test/vinyl/stat.result index 08da1658..ff73d42a 100644 --- a/test/vinyl/stat.result +++ b/test/vinyl/stat.result @@ -716,8 +716,8 @@ stat_diff(istat(), st) rows: 5 bytes: 5305 get: - rows: 9 - bytes: 9549 + rows: 5 + bytes: 5305 txw: iterator: lookup: 1 diff --git a/test/vinyl/upsert.result b/test/vinyl/upsert.result index 6c6d03d6..c807b5b5 100644 --- a/test/vinyl/upsert.result +++ b/test/vinyl/upsert.result @@ -791,3 +791,47 @@ s:select({100}, 'GE') s:drop() --- ... +-- +-- Check that read iterator ignores a cached statement in case +-- the current key is updated while it yields on disk read. +-- +fiber = require('fiber') +--- +... +s = box.schema.space.create('test', {engine = 'vinyl'}) +--- +... +_ = s:create_index('pk') +--- +... +s:replace{10, 10} +--- +- [10, 10] +... +box.snapshot() +--- +- ok +... +s:get(10) -- add [10, 10] to the cache +--- +- [10, 10] +... +ch = fiber.channel(1) +--- +... +test_run:cmd("setopt delimiter ';'") +--- +- true +... +_ = fiber.create(function() ch:put(s:select()) end) +s:upsert({10, 10}, {{'+', 2, 10}}) +test_run:cmd("setopt delimiter ''"); +--- +... +ch:get() -- should see the UPSERT and return [10, 20] +--- +- - [10, 20] +... +s:drop() +--- +... diff --git a/test/vinyl/upsert.test.lua b/test/vinyl/upsert.test.lua index ef83e0f6..cf287360 100644 --- a/test/vinyl/upsert.test.lua +++ b/test/vinyl/upsert.test.lua @@ -325,3 +325,21 @@ s:upsert({100}, {{'+', 2, 100}}) s:select({100}, 'GE') s:drop() + +-- +-- Check that read iterator ignores a cached statement in case +-- the current key is updated while it yields on disk read. +-- +fiber = require('fiber') +s = box.schema.space.create('test', {engine = 'vinyl'}) +_ = s:create_index('pk') +s:replace{10, 10} +box.snapshot() +s:get(10) -- add [10, 10] to the cache +ch = fiber.channel(1) +test_run:cmd("setopt delimiter ';'") +_ = fiber.create(function() ch:put(s:select()) end) +s:upsert({10, 10}, {{'+', 2, 10}}) +test_run:cmd("setopt delimiter ''"); +ch:get() -- should see the UPSERT and return [10, 20] +s:drop() -- 2.11.0
next prev parent reply other threads:[~2019-03-26 15:50 UTC|newest] Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-03-26 15:50 [PATCH 0/6] vinyl: iterator cleanup Vladimir Davydov 2019-03-26 15:50 ` [PATCH 1/6] vinyl: txw iterator: fold eq check in seek method Vladimir Davydov 2019-03-28 14:25 ` [tarantool-patches] " Konstantin Osipov 2019-03-26 15:50 ` [PATCH 2/6] vinyl: cache " Vladimir Davydov 2019-03-28 14:27 ` [tarantool-patches] " Konstantin Osipov 2019-03-26 15:50 ` [PATCH 3/6] vinyl: cache iterator: consolidate curr_stmt updates Vladimir Davydov 2019-03-28 14:29 ` [tarantool-patches] " Konstantin Osipov 2019-03-28 14:47 ` Vladimir Davydov 2019-03-26 15:50 ` [PATCH 4/6] vinyl: run iterator: zap search_ended flag Vladimir Davydov 2019-03-28 14:35 ` [tarantool-patches] " Konstantin Osipov 2019-03-28 14:50 ` Vladimir Davydov 2019-03-26 15:50 ` [PATCH 5/6] vinyl: run iterator: refactor seek method Vladimir Davydov 2019-03-28 14:39 ` [tarantool-patches] " Konstantin Osipov 2019-03-28 14:58 ` Vladimir Davydov 2019-03-26 15:50 ` Vladimir Davydov [this message] 2019-03-28 14:47 ` [tarantool-patches] Re: [PATCH 6/6] vinyl: simplify read iterator restoration behavior Konstantin Osipov 2019-03-28 16:28 ` [PATCH 0/6] vinyl: iterator cleanup Vladimir Davydov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=208f29dcc3ca58f2acd4601195402c7dadbbb1fe.1553613748.git.vdavydov.dev@gmail.com \ --to=vdavydov.dev@gmail.com \ --cc=tarantool-patches@freelists.org \ --subject='Re: [PATCH 6/6] vinyl: simplify read iterator restoration behavior' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox