[RFC PATCH 04/23] vinyl: make point lookup always return the latest tuple version

Tue Jul 10 19:19:26 MSK 2018

* Vladimir Davydov <vdavydov.dev at gmail.com> [18/07/08 22:52]:
> Currently, vy_point_lookup(), in contrast to vy_read_iterator, doesn't
> rescan the memory level after reading disk, so if the caller doesn't
> track the key before calling this function, the caller won't be sent to
> a read view if the key gets updated during a yield and hence will
> be returned a stale tuple. This is OK now, because we always track the
> key before calling vy_point_lookup(), either in the primary or in a
> secondary index. However, for #2129 we need it to always return the
> latest tuple version, no matter if the key is tracked or not.
> 
> The point is that in the scope of #2129 we won't write DELETE statements to
> secondary indexes corresponding to a tuple replaced in the primary
> index. Instead, after reading a tuple from a secondary index, we will
> check whether it matches the tuple corresponding to it in the primary
> index: if it does not, it means that the tuple read from the secondary
> index was overwritten and should be skipped. E.g. suppose we have the
> primary index over the first field and a secondary index over the second
> field and the following statements in the space:
> 
>   REPLACE{1, 10}
>   REPLACE{1, 20}
> 
> Then reading {10} from the secondary index will return REPLACE{1, 10}, but
> a lookup of {1} in the primary index will return REPLACE{1, 20}, which
> doesn't match REPLACE{1, 10} read from the secondary index; hence the
> latter was overwritten and should be skipped.
> 
> The problem is that in the example above we don't want to track key {1} in
> the primary index before lookup, because we don't actually read its
> value. So for the check to work correctly, we need the point lookup to
> guarantee that the returned tuple is always the newest one. It's fairly
> easy to do - we just need to rescan the memory level after yielding on
> disk if its version changed.
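
To make the rescan idea above concrete, here is a toy Python
model of it. It is only a sketch under stated assumptions, not
vinyl code: ToyLsm, mem_version and the on_yield hook are
hypothetical names invented for illustration, and the "disk
read" is a plain dict access with a callback standing in for
the yield.

```python
class ToyLsm:
    """Toy two-level store (hypothetical, not the vinyl API)."""

    def __init__(self):
        self.mem = {}          # memory level: key -> newest tuple
        self.disk = {}         # disk level: key -> older tuple
        self.mem_version = 0   # bumped on every write to memory

    def replace(self, key, value):
        self.mem[key] = value
        self.mem_version += 1

    def point_lookup(self, key, on_yield=None):
        while True:
            # Remember the memory level version before going to disk.
            version = self.mem_version
            if key in self.mem:
                return self.mem[key]
            # Simulated yield: a concurrent writer may run here, once.
            if on_yield is not None:
                callback, on_yield = on_yield, None
                callback()
            result = self.disk.get(key)
            if self.mem_version == version:
                return result
            # Memory changed while we were "on disk": rescan it.


store = ToyLsm()
store.disk[1] = (1, 10)
# A writer replaces key 1 while the lookup is yielding on disk.
print(store.point_lookup(1, on_yield=lambda: store.replace(1, (1, 20))))
# prints (1, 20)
```

Without the version check the lookup would return the stale
(1, 10) read from disk, even though the key is not tracked.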

Thank you for the explanation. I haven't read the patch itself
yet. But aren't you complicating things more than necessary? All
we need to do when looking up a match in the primary index is to
compare the LSN of the match with the LSN of the secondary index
tuple. If they differ, then we need to skip the secondary key
tuple: it's garbage. The check does not need to take into account
new tuples which appeared during a yield, since a mismatch cannot
appear during a yield.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov