[Tarantool-patches] [PATCH] rfc: multi-directional iterators

Konstantin Osipov kostja.osipov at gmail.com
Mon Jan 27 22:34:14 MSK 2020


* Nikita Pettik <korablev at tarantool.org> [20/01/26 23:58]:

Please choose #1


> +As one can see, results are sorted in different orders by second key part,
> +but in the same order by first key part (not surprisingly). Assume first
> +element with first key part satisfying search condition is located: {1, 0}.
> +Then let's find out the first key part with different iterating order (in our
> +case it is second key part). Since order is different for that key part, it is
> +required to locate the first tuple with next first key part value: {2, 0}.
> +After that, auxiliary iterator is created and positioned to that tuple (see
> +schema below). Since order for the second key part is different, auxiliary
> +iterator moves towards main iterator.
> +
> +```
> +[1, 0], [1, 0], [1, 1], [2, 0] ... // tuples are arranged as in B-tree
> +^                      ^
> +|               <----- |
> +Main iterator         Aux. iterator
> +```
> +Note that auxiliary iterator is assumed to process all keys between its initial
> +position and main iterator position (since those keys are ordered using other
> +direction - sort of full scan). Auxiliary iterator is required for each key
> +part starting from that which direction is different from one of first key part.
> +So that iteration process is likely to be slow without any optimizations.
> +For EQ iterator type it is possible to simply skip those tuples which doesn't
> +satisfy equality condition. In turn, it results in necessity to extract part of
> +key value for all 'EQ' iterator key parts and compare it with reference key
> +value. This algorithm can be generalized for any number of key parts in index.  
> +
> +Pros (+):
> + - it allows to specify any sets of key part iteration orders;
> + - in contrast to the second implementation, the first resulting tuple is
> +   returned way much faster (since there's no time overhead to built new tree);
> + - finally, there's almost no memory overhead.  
> +
> +Cons (-):
> + - obviously the main drawback of this approach is time complexity -
> +   it doesn't seem to be way faster than simple full-scan (the more key parts
> +   with different iterating order are declared, the slower process will be).

Locating the upper bound is O(log(N)), not O(N).

C++17 added map::equal_range, bps could provide a similar API

http://www.cplusplus.com/reference/map/map/equal_range/

-- 
Konstantin Osipov, Moscow, Russia


More information about the Tarantool-patches mailing list