* [tarantool-patches] Re: [PATCH 1/1] vinyl: fix crash in vinyl_iterator_secondary_next
2018-05-11 23:24 [tarantool-patches] [PATCH 1/1] vinyl: fix crash in vinyl_iterator_secondary_next Vladislav Shpilevoy
@ 2018-05-11 23:52 ` Konstantin Osipov
2018-05-12 0:30 ` [tarantool-patches] " Vladimir Davydov
1 sibling, 0 replies; 3+ messages in thread
From: Konstantin Osipov @ 2018-05-11 23:52 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches
* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/05/12 02:27]:
> Vinyl_iterator_secondary_next is a method to iterate over a
> secondary index. It gets a tuple from the secondary index and
> looks up its full version in the primary index.
>
> But during lookup the tuple can be deleted from the former, and
> the iterator must take it in the hopper.
>
> Closes #3393
First, is 1.9 affected? My guess is it isn't and this
is a regression from Vova's refactoring for online after.
The key to the problem seems to be in the exact timing when the
key is added to the transaction conflict manager by vy_tx_track().
But in this case your fix seems to be rather crude, the problem
should be solved by harmonizing the timing of tracking and lookup
in the primary index.
I got in touch with Vova and he confirmed that the read should be
tracked before a yield.
Let's wait for his root cause analysis, he promised to take a look
at the test case/patch.
> src/box/vinyl.c | 8 ++++-
> test/vinyl/errinj.result | 73 ++++++++++++++++++++++++++++++++++++++++++++++
> test/vinyl/errinj.test.lua | 26 +++++++++++++++++
> 3 files changed, 106 insertions(+), 1 deletion(-)
>
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index ff4c283..460b9b5 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -3853,7 +3853,7 @@ vinyl_iterator_secondary_next(struct iterator *base, struct tuple **ret)
> goto fail;
> }
>
> -
> +next_tuple:
> if (vy_read_iterator_next(&it->iterator, &tuple) != 0)
> goto fail;
>
> @@ -3879,6 +3879,12 @@ vinyl_iterator_secondary_next(struct iterator *base, struct tuple **ret)
> if (vy_point_lookup(it->lsm->pk, it->tx, vy_tx_read_view(it->tx),
> tuple, &tuple) != 0)
> goto fail;
> + /*
> + * The tuple can be removed by the same transaction during
> + * primary index lookup.
> + */
> + if (tuple == NULL)
> + goto next_tuple;
> *ret = tuple_bless(tuple);
> tuple_unref(tuple);
> if (*ret != NULL)
> diff --git a/test/vinyl/errinj.result b/test/vinyl/errinj.result
> index fd21f7b..b63d01c 100644
> --- a/test/vinyl/errinj.result
> +++ b/test/vinyl/errinj.result
> @@ -1395,3 +1395,76 @@ s:count() -- 200
> s:drop()
> ---
> ...
> +--
> +-- gh-3393: vinyl secondary index iterator must take in the hopper
> +-- that a tuple can disappear from the primary index after the
> +-- secondary index lookup.
> +--
> +s = box.schema.create_space('test', {engine = 'vinyl'})
> +---
> +...
> +pk = s:create_index('pk')
> +---
> +...
> +sk = s:create_index('sk', {parts = {{2, 'unsigned'}, {1, 'unsigned'}}})
> +---
> +...
> +s:replace{1, 2}
> +---
> +- [1, 2]
> +...
> +s:replace{3, 4}
> +---
> +- [3, 4]
> +...
> +box.snapshot()
> +---
> +- ok
> +...
> +ret = nil
> +---
> +...
> +function iterate(i) ret = i.next() end
> +---
> +...
> +box.begin()
> +---
> +...
> +s:replace{5, 6} -- Start a transaction in the engine.
> +---
> +- [5, 6]
> +...
> +iter = create_iterator(sk, {4}, {iterator = 'GE'})
> +---
> +...
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", true)
> +---
> +- ok
> +...
> +f = fiber.create(iterate, iter)
> +---
> +...
> +s:delete{3}
> +---
> +...
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", false)
> +---
> +- ok
> +...
> +while not ret do fiber.sleep(0.1) end
> +---
> +...
> +ret
> +---
> +- [5, 6]
> +...
> +iter.iterate_over()
> +---
> +- []
> +...
> +box.commit()
> +---
> +...
> +s:drop()
> +---
> +...
> diff --git a/test/vinyl/errinj.test.lua b/test/vinyl/errinj.test.lua
> index 64d04c6..c551de1 100644
> --- a/test/vinyl/errinj.test.lua
> +++ b/test/vinyl/errinj.test.lua
> @@ -548,3 +548,29 @@ ch:get()
>
> s:count() -- 200
> s:drop()
> +
> +--
> +-- gh-3393: vinyl secondary index iterator must take in the hopper
> +-- that a tuple can disappear from the primary index after the
> +-- secondary index lookup.
> +--
> +s = box.schema.create_space('test', {engine = 'vinyl'})
> +pk = s:create_index('pk')
> +sk = s:create_index('sk', {parts = {{2, 'unsigned'}, {1, 'unsigned'}}})
> +s:replace{1, 2}
> +s:replace{3, 4}
> +box.snapshot()
> +ret = nil
> +function iterate(i) ret = i.next() end
> +box.begin()
> +s:replace{5, 6} -- Start a transaction in the engine.
> +iter = create_iterator(sk, {4}, {iterator = 'GE'})
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", true)
> +f = fiber.create(iterate, iter)
> +s:delete{3}
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", false)
> +while not ret do fiber.sleep(0.1) end
> +ret
> +iter.iterate_over()
> +box.commit()
> +s:drop()
>
> --
> To stop receiving notification emails like this one, please contact
> the administrator of this repository.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
> Vinyl_iterator_secondary_next is a method to iterate over a
> secondary index. It gets a tuple from the secondary index and
> looks up its full version in the primary index.
>
> But during lookup the tuple can be deleted from the former, and
> the iterator must take it in the hopper.
>
> Closes #3393
> ---
> Branch: https://github.com/tarantool/tarantool/tree/gh-3393-vinyl-iterator-crash
> Issue: https://github.com/tarantool/tarantool/issues/3393
>
> src/box/vinyl.c | 8 ++++-
> test/vinyl/errinj.result | 73 ++++++++++++++++++++++++++++++++++++++++++++++
> test/vinyl/errinj.test.lua | 26 +++++++++++++++++
> 3 files changed, 106 insertions(+), 1 deletion(-)
>
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index ff4c28314..460b9b5e7 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -3853,7 +3853,7 @@ vinyl_iterator_secondary_next(struct iterator *base, struct tuple **ret)
> goto fail;
> }
>
> -
> +next_tuple:
> if (vy_read_iterator_next(&it->iterator, &tuple) != 0)
> goto fail;
>
> @@ -3879,6 +3879,12 @@ vinyl_iterator_secondary_next(struct iterator *base, struct tuple **ret)
> if (vy_point_lookup(it->lsm->pk, it->tx, vy_tx_read_view(it->tx),
> tuple, &tuple) != 0)
> goto fail;
> + /*
> + * The tuple can be removed by the same transaction during
> + * primary index lookup.
> + */
> + if (tuple == NULL)
> + goto next_tuple;
> *ret = tuple_bless(tuple);
> tuple_unref(tuple);
> if (*ret != NULL)
> diff --git a/test/vinyl/errinj.result b/test/vinyl/errinj.result
> index fd21f7bb4..b63d01c18 100644
> --- a/test/vinyl/errinj.result
> +++ b/test/vinyl/errinj.result
> @@ -1395,3 +1395,76 @@ s:count() -- 200
> s:drop()
> ---
> ...
> +--
> +-- gh-3393: vinyl secondary index iterator must take in the hopper
> +-- that a tuple can disappear from the primary index after the
> +-- secondary index lookup.
> +--
> +s = box.schema.create_space('test', {engine = 'vinyl'})
> +---
> +...
> +pk = s:create_index('pk')
> +---
> +...
> +sk = s:create_index('sk', {parts = {{2, 'unsigned'}, {1, 'unsigned'}}})
> +---
> +...
> +s:replace{1, 2}
> +---
> +- [1, 2]
> +...
> +s:replace{3, 4}
> +---
> +- [3, 4]
> +...
> +box.snapshot()
> +---
> +- ok
> +...
> +ret = nil
> +---
> +...
> +function iterate(i) ret = i.next() end
> +---
> +...
> +box.begin()
> +---
> +...
> +s:replace{5, 6} -- Start a transaction in the engine.
> +---
> +- [5, 6]
> +...
> +iter = create_iterator(sk, {4}, {iterator = 'GE'})
> +---
> +...
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", true)
> +---
> +- ok
> +...
> +f = fiber.create(iterate, iter)
> +---
> +...
> +s:delete{3}
> +---
> +...
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", false)
> +---
> +- ok
> +...
> +while not ret do fiber.sleep(0.1) end
> +---
> +...
> +ret
> +---
> +- [5, 6]
> +...
> +iter.iterate_over()
> +---
> +- []
> +...
> +box.commit()
> +---
> +...
> +s:drop()
> +---
> +...
> diff --git a/test/vinyl/errinj.test.lua b/test/vinyl/errinj.test.lua
> index 64d04c62d..c551de1b3 100644
> --- a/test/vinyl/errinj.test.lua
> +++ b/test/vinyl/errinj.test.lua
> @@ -548,3 +548,29 @@ ch:get()
>
> s:count() -- 200
> s:drop()
> +
> +--
> +-- gh-3393: vinyl secondary index iterator must take in the hopper
> +-- that a tuple can disappear from the primary index after the
> +-- secondary index lookup.
> +--
> +s = box.schema.create_space('test', {engine = 'vinyl'})
> +pk = s:create_index('pk')
> +sk = s:create_index('sk', {parts = {{2, 'unsigned'}, {1, 'unsigned'}}})
> +s:replace{1, 2}
> +s:replace{3, 4}
> +box.snapshot()
> +ret = nil
> +function iterate(i) ret = i.next() end
> +box.begin()
> +s:replace{5, 6} -- Start a transaction in the engine.
> +iter = create_iterator(sk, {4}, {iterator = 'GE'})
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", true)
> +f = fiber.create(iterate, iter)
> +s:delete{3}
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", false)
> +while not ret do fiber.sleep(0.1) end
> +ret
> +iter.iterate_over()
> +box.commit()
> +s:drop()
> --
> 2.15.1 (Apple Git-101)
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [tarantool-patches] [PATCH 1/1] vinyl: fix crash in vinyl_iterator_secondary_next
2018-05-11 23:24 [tarantool-patches] [PATCH 1/1] vinyl: fix crash in vinyl_iterator_secondary_next Vladislav Shpilevoy
2018-05-11 23:52 ` [tarantool-patches] " Konstantin Osipov
@ 2018-05-12 0:30 ` Vladimir Davydov
1 sibling, 0 replies; 3+ messages in thread
From: Vladimir Davydov @ 2018-05-12 0:30 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches, kostja
On Sat, May 12, 2018 at 02:24:41AM +0300, Vladislav Shpilevoy wrote:
> +
> +--
> +-- gh-3393: vinyl secondary index iterator must take in the hopper
> +-- that a tuple can disappear from the primary index after the
> +-- secondary index lookup.
> +--
> +s = box.schema.create_space('test', {engine = 'vinyl'})
> +pk = s:create_index('pk')
> +sk = s:create_index('sk', {parts = {{2, 'unsigned'}, {1, 'unsigned'}}})
> +s:replace{1, 2}
> +s:replace{3, 4}
> +box.snapshot()
> +ret = nil
> +function iterate(i) ret = i.next() end
> +box.begin()
> +s:replace{5, 6} -- Start a transaction in the engine.
> +iter = create_iterator(sk, {4}, {iterator = 'GE'})
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", true)
> +f = fiber.create(iterate, iter)
> +s:delete{3}
> +errinj.set("ERRINJ_VY_DELAY_PK_LOOKUP", false)
> +while not ret do fiber.sleep(0.1) end
Here's what happens here:
1. You create an iterator in a transaction and pass it to a new fiber.
Let us denote the current fiber as fiber A and the new fiber as
fiber B.
2. Fiber B uses the iterator to read the secondary key {4}, then it
yields right before fetching the tuple {3, 4} from the primary key.
3. In the mid time fiber A deletes tuple {3, 4} by writing DELETE{3, 4}
to the transaction write set.
4. Fiber B continues execution and since it's bound to transaction
corresponding to fiber A, it finds DELETE{3, 4} in the transaction
write set and crashes.
I don't think this has anything to do with the original bug you're
pursuing. The use case is quite unnatural obviously. I guess we should
simply prohibit it (i.e. abort read iterator if it is passed to another
fiber). I surmise the crash our customer is experiencing is caused by a
bug in read iterator restore procedure, which was probably brought about
by one of my recent patches.
That said, NAK for this patch.
> +ret
> +iter.iterate_over()
> +box.commit()
> +s:drop()
^ permalink raw reply [flat|nested] 3+ messages in thread