Tarantool development patches archive
 help / color / mirror / Atom feed
* Re: [Tarantool-patches] [PATCH] rfc: multi-directional iterators
@ 2020-04-24 10:10 Aleksandr Lyapunov
  2020-04-25  9:03 ` Konstantin Osipov
  0 siblings, 1 reply; 6+ messages in thread
From: Aleksandr Lyapunov @ 2020-04-24 10:10 UTC (permalink / raw)
  To: korablev; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 11129 bytes --]

> Part of #3243
> ---
> rfc in human-readable format:https://github.com/tarantool/tarantool/blob/np/gh-3243-multi-directional-iter-rfc/doc/rfc/3309-multi-directional-iterators.md
> Issue:https://github.com/tarantool/tarantool/issues/3243
>
>   doc/rfc/3309-multi-directional-iterators.md | 186 ++++++++++++++++++++++++++++
>   1 file changed, 186 insertions(+)
>   create mode 100644 doc/rfc/3309-multi-directional-iterators.md
>
> diff --git a/doc/rfc/3309-multi-directional-iterators.md b/doc/rfc/3309-multi-directional-iterators.md
> new file mode 100644
> index 000000000..4f8ca7c99
> --- /dev/null
> +++ b/doc/rfc/3309-multi-directional-iterators.md
> @@ -0,0 +1,186 @@
> +# Multi-directional iterators
> +
> +* **Status**: In progress
> +* **Start date**: 22-01-2020
> +* **Authors**: Nikita Pettik @korablev77korablev at tarantool.org  <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>
> +* **Issues**: [#3243](https://github.com/tarantool/tarantool/issues/3243)
> +
> +
> +## Background and motivation
> +
> +This RFC touches only Memtx engine and TREE index type (as the only available
> +in SQL and most common in user's practice). Multi-directional iterator is
> +an iterator which allows iterating through different key parts orders.
> +For instance: consider index `i` consisting of three key parts: `a`, `b` and `c`.
> +Creating and using casual iterator looks like:
> +```
> +i = box.space.T.index.i
> +i:select({1, 2, 3}, {iterator = 'EQ'}) -- returns all tuples which has
> +                                       -- fields a == 1, b == 2, c == 3.
> +```
> +It is OK to omit one or more key parts declaring key value to be searched. In
> +this case they are assumed to be nils:
> +`i:select({1}, {iterator = 'EQ'})` is the same as
> +`i:select({1, nil, nil}, {iterator = 'EQ'})`. So all tuples which has `a == 1`
> +are getting to the result set. More formally matching rule is following:
> +```
> +if (search-key-part[i] is nil)
> +{
> +  if (iterator is LT or GT) return FALSE
> +  return TRUE
What does that TRUE and FALSE mean?
BTW tree index works fine with partial keys and GT/LT iterators.
E.g. `i:select({1}, {iterator = 'GT'})` returns all tuples with `a > 1`.

> +}
> +```
> +
> +Another example:
> +`i:select({1, 1, 1}, {iterator = 'GE'})`
> +
> +Here all tuples with `a >= 1`, `b >= 1` and `c >= 1` are returned. But some users
That is not correct. Tuples and keys are compared lexicographically. 
Thus all tuples with
(`a = 1` AND `b = 1` AND `c >= 1') OR (`a = 1` AND `b > 1`) OR (`a > 1`) 
are the result.

> +may want to select tuples with `a >= 1`, `b >= 1` but `c < 1`. Or, alternatively,
> +somebody may be willing to get tuples ordered by `a` and `b` in ascending order
> +but by `c` in descending order: `i:select({}, {iterator = {'GE', 'GE', 'LE'})`.
Those are not alternatives but independent options. One may wish to select:
`SELECT * FROM t WHERE a > 1 AND b < 1 ORDER BY a DESC, b ASC`.
> +It is analogue of common SQL query `SELECT * FROM t ORDER BY a ASC, b ASC, c DESC`.
> +These requests are obviously impossible to fulfill with current indexes and
> +iterators implementations. This RFC suggests ways to resolve mentioned problem
> +in particular for memtx TREE indexes.
> +
> +## Implementation details
> +
> +TREE indexes in memtx engine are implemented as BPS-trees (see
> +`src/lib/salad/bps_tree.h` for details). Keys correspond to particular values
> +of key parts; data - to pointers to tuples. Hence, all data
> +are sorted by their key values due to tree structure. For this reason HASH
> +indexes have only GT and EQ (and ergo GE) iterators - data stored in a hash is
> +unordered. Tree interface itself provides several functions to operate on data.
> +Iteration process starts in `tree_iterator_start()` (which is called once as
> +`iterator->next()`): depending on its type iterator is positioned to the lower
> +or upper bound (via `memtx_tree_lower_bound()`) of range of values satisfying
> +search condition. In case key is not specified (i.e. empty), iterator is simply
> +set to the first or last element of tree. At this moment first element to be
> +returned (if any) is ready. To continue iterating `next` method of iterator
> +object is changed to one of `tree_iterator_next()`, `tree_iterator_prev()` or
> +their analogues for GE and LE iterators. Actually these functions fetch next
> +element from B-tree leaf block. If iterator points to the last element in the
> +block, it is set to the first element of the next block (leaf blocks are linked
> +into list); if there's no more blocks, iterator is invalidated and iteration
> +process is finished.
> +Taking into account this information let's review several approaches how to
> +implement multi-directional iterators.
> +
> +### Solution №1
> +
> +First solution doesn't involve any additional data structures so that it deals
> +with multi-directional iterators only using existing B-tree structure.
> +It fact, first key part can be used to locate first element as a candidate
> +in the range to be selected. To illustrate this point let's consider following
> +example:
> +
> +```
> +s:create_index('i', {parts = {{1, 'integer'}, {2, 'integer'}}})`
> +s:insert({1, 0})
> +s:insert({1, 0})
> +s:insert({1, 1})
> +s:insert({2, 0})
> +s:insert({2, 1})
> +i:select({}, {iterator = {'GE', 'LE'}})
> +```
> +
> +Result should be:
> +```
> +[1, 1]
> +[1, 0]
> +[1, 0]
> +[2, 1]
> +[2, 0]
> +```
> +Note that in case of casual GE iterator (i.e. {GE, GE} in terms of
> +multi-directional iterators) result is:
> +```
> +[1, 0]
> +[1, 0]
> +[1, 1]
> +[2, 0]
> +[2, 1]
> +```
> +As one can see, results are sorted in different orders by second key part,
> +but in the same order by first key part (not surprisingly). Assume first
> +element with first key part satisfying search condition is located: {1, 0}.
> +Then let's find out the first key part with different iterating order (in our
> +case it is second key part). Since order is different for that key part, it is
> +required to locate the first tuple with next first key part value: {2, 0}.
> +After that, auxiliary iterator is created and positioned to that tuple (see
Generally a stack of iterators is required for that.
> +schema below). Since order for the second key part is different, auxiliary
> +iterator moves towards main iterator.
> +
> +```
> +[1, 0], [1, 0], [1, 1], [2, 0] ... // tuples are arranged as in B-tree
> +^                      ^
> +|               <----- |
> +Main iterator         Aux. iterator
> +```
> +Note that auxiliary iterator is assumed to process all keys between its initial
> +position and main iterator position (since those keys are ordered using other
> +direction - sort of full scan). Auxiliary iterator is required for each key
> +part starting from that which direction is different from one of first key part.
> +So that iteration process is likely to be slow without any optimizations.
> +For EQ iterator type it is possible to simply skip those tuples which doesn't
> +satisfy equality condition. In turn, it results in necessity to extract part of
> +key value for all 'EQ' iterator key parts and compare it with reference key
> +value. This algorithm can be generalized for any number of key parts in index.
> +
> +Pros (+):
> + - it allows to specify any sets of key part iteration orders;
> + - in contrast to the second implementation, the first resulting tuple is
> +   returned way much faster (since there's no time overhead to built new tree);
> + - finally, there's almost no memory overhead.
> +
> +Cons (-):
> + - obviously the main drawback of this approach is time complexity -
> +   it doesn't seem to be way faster than simple full-scan (the more key parts
> +   with different iterating order are declared, the slower process will be).
Actually that's not so complex. As I can see the proposed iterator will scan
only those tuples that is about to output. There is also an optimization of
a tree traverse+search that has O(log(N)) complexity that would help
in case of SELECT with LIMIT.
> +
> +### Solution №2
> +
> +Since BPS tree is built without acknowledge in which order keys should be
> +placed, it is assumed that order is always ascending: keys in blocks are sorted
> +from smaller to bigger (left-to-right); comparison between keys is made by
> +tuple_compare_with_key() function. It makes given tree be unsuitable for
> +efficient iteration in different orders. On the other hand, it is possible to
> +build new *temporary in-memory* BPS-tree featuring correct key order. It seems
> +to be easy to achieve since keys order depends on result of comparator function.
> +Reverting result of comparison for key parts corresponding to opposite iteration
> +direction gives appropriate keys ordering in the tree. Note that not all data in
> +space is needed to be in tree (in case of non-empty search key); only sub-tree
> +making up lower or upper bound of first key part is required.
> +
> +Pros (+):
> + - any sets of key part iteration orders are allowed.
> +
> +Cons (-):
> + - first tuple to selected is probably returned with significant delay;
In the worst case it'll cost N*log(N), where N is the index size.
Note that such a request will delay ALL the requests for unpredictable 
timespan.
BTW, usually it's better to build a temporary array and sort it rather 
than building
a temporary tree for just one output.
But still I fear that solution is not an option.
> + - tree memory construction overhead (only during iteration routine).
> +
> +### Solution №3
> +
> +It extends solution №2 in sense it allows specifying sorting direction for
> +each part right in key def, that is during index creation. For instance:
> +
> +`s:create_index('i', {parts = {{1, 'integer', 'asc'}, {2, 'integer', 'desc'}}})`
Already implemented:
`i = s:create_index('test', {type='tree', parts={{1, 'uint', 
sort_order='desc'}}})`
> +
> +After that 'GT'/'LT' iterators for parts with declared 'desc' sorting order will
> +return reversed results of comparison, so only comparators are affected.
> +That's it (probably the simplest solution; what is more 'DESC' index is casual
> +SQL feature in other DBs).
> +
> +Pros (+):
> + - index search via 'desc' iterators is almost as fast as via casual
> +   iterators;
> + - this approach seems to be easy in implementation and resolves
> +   problem in SQL (since at the moment of ephemeral space creation it is allowed
> +   to set its PK key parts orders).
> +
> +Cons (-):
> + - 'desc' indexes are not versatile - user is unable to set different
> +    orders in iterator;
> + - order of iteration itself is immutable. As a result, for each different
> +   iteration order user has to create separate index which in turn consumes
> +   additional memory and time as any other index.
> -- 
> 2.15.1
My summary:

 1. `select({1, 1}, {iterator='GE'})` is not equivalent to `SELECT ..
    WHERE a>=1 AND b >= 1`.
 2. It is still possible to implement `iterator={'GE', 'GE'}` that will
    suitable for that SQL SELECT.
 3. It is also possible to implement `order={'ASC', 'DESC'}` option in
    select using the solution #1.


[-- Attachment #2: Type: text/html, Size: 12164 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH] rfc: multi-directional iterators
  2020-04-24 10:10 [Tarantool-patches] [PATCH] rfc: multi-directional iterators Aleksandr Lyapunov
@ 2020-04-25  9:03 ` Konstantin Osipov
  0 siblings, 0 replies; 6+ messages in thread
From: Konstantin Osipov @ 2020-04-25  9:03 UTC (permalink / raw)
  To: Aleksandr Lyapunov; +Cc: tarantool-patches

* Aleksandr Lyapunov <alyapunov@tarantool.org> [20/04/24 13:11]:
> > +may want to select tuples with `a >= 1`, `b >= 1` but `c < 1`. Or, alternatively,
> > +somebody may be willing to get tuples ordered by `a` and `b` in ascending order
> > +but by `c` in descending order: `i:select({}, {iterator = {'GE', 'GE', 'LE'})`.
> Those are not alternatives but independent options. One may wish to select:
> `SELECT * FROM t WHERE a > 1 AND b < 1 ORDER BY a DESC, b ASC`.

I agree. 

> My summary:
> 
> 1. `select({1, 1}, {iterator='GE'})` is not equivalent to `SELECT ..
>    WHERE a>=1 AND b >= 1`.
> 2. It is still possible to implement `iterator={'GE', 'GE'}` that will
>    suitable for that SQL SELECT.
> 3. It is also possible to implement `order={'ASC', 'DESC'}` option in
>    select using the solution #1.

I agree, but it's not urgent either. These queries are rare.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH] rfc: multi-directional iterators
  2020-01-26 20:54 Nikita Pettik
  2020-01-27 19:34 ` Konstantin Osipov
@ 2020-04-25  9:00 ` Konstantin Osipov
  1 sibling, 0 replies; 6+ messages in thread
From: Konstantin Osipov @ 2020-04-25  9:00 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

* Nikita Pettik <korablev@tarantool.org> [20/01/26 23:58]:
> +Cons (-):
> + - obviously the main drawback of this approach is time complexity -
> +   it doesn't seem to be way faster than simple full-scan (the more key parts
> +   with different iterating order are declared, the slower process will be).

This is not true. Getting to the next distinct element in a tree is O(log(distance
between elements)) of the tree, not O(distance between elements).

Actually it's quite evident that for BPS it's best to use this
option, the main issue with the implementation is that
multi-directional iterators require more state (they need to
remember a stack of positions), than a single-directional one (it
only has one position, the current one).


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH] rfc: multi-directional iterators
  2020-01-27 19:34 ` Konstantin Osipov
@ 2020-01-27 19:38   ` Konstantin Osipov
  0 siblings, 0 replies; 6+ messages in thread
From: Konstantin Osipov @ 2020-01-27 19:38 UTC (permalink / raw)
  To: Nikita Pettik, tarantool-patches, v.shpilevoy

* Konstantin Osipov <kostja.osipov@gmail.com> [20/01/27 22:34]:
> Locating the upper bound is O(log(N)), not O(N).
> 
> C++17 added map::equal_range, bps could provide a similar API
> 
> http://www.cplusplus.com/reference/map/map/equal_range/

the right link is http://www.cplusplus.com/reference/map/multimap/equal_range/

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH] rfc: multi-directional iterators
  2020-01-26 20:54 Nikita Pettik
@ 2020-01-27 19:34 ` Konstantin Osipov
  2020-01-27 19:38   ` Konstantin Osipov
  2020-04-25  9:00 ` Konstantin Osipov
  1 sibling, 1 reply; 6+ messages in thread
From: Konstantin Osipov @ 2020-01-27 19:34 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy

* Nikita Pettik <korablev@tarantool.org> [20/01/26 23:58]:

Please choose #1


> +As one can see, results are sorted in different orders by second key part,
> +but in the same order by first key part (not surprisingly). Assume first
> +element with first key part satisfying search condition is located: {1, 0}.
> +Then let's find out the first key part with different iterating order (in our
> +case it is second key part). Since order is different for that key part, it is
> +required to locate the first tuple with next first key part value: {2, 0}.
> +After that, auxiliary iterator is created and positioned to that tuple (see
> +schema below). Since order for the second key part is different, auxiliary
> +iterator moves towards main iterator.
> +
> +```
> +[1, 0], [1, 0], [1, 1], [2, 0] ... // tuples are arranged as in B-tree
> +^                      ^
> +|               <----- |
> +Main iterator         Aux. iterator
> +```
> +Note that auxiliary iterator is assumed to process all keys between its initial
> +position and main iterator position (since those keys are ordered using other
> +direction - sort of full scan). Auxiliary iterator is required for each key
> +part starting from that which direction is different from one of first key part.
> +So that iteration process is likely to be slow without any optimizations.
> +For EQ iterator type it is possible to simply skip those tuples which doesn't
> +satisfy equality condition. In turn, it results in necessity to extract part of
> +key value for all 'EQ' iterator key parts and compare it with reference key
> +value. This algorithm can be generalized for any number of key parts in index.  
> +
> +Pros (+):
> + - it allows to specify any sets of key part iteration orders;
> + - in contrast to the second implementation, the first resulting tuple is
> +   returned way much faster (since there's no time overhead to built new tree);
> + - finally, there's almost no memory overhead.  
> +
> +Cons (-):
> + - obviously the main drawback of this approach is time complexity -
> +   it doesn't seem to be way faster than simple full-scan (the more key parts
> +   with different iterating order are declared, the slower process will be).

Locating the upper bound is O(log(N)), not O(N).

C++17 added map::equal_range, bps could provide a similar API

http://www.cplusplus.com/reference/map/map/equal_range/

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Tarantool-patches] [PATCH] rfc: multi-directional iterators
@ 2020-01-26 20:54 Nikita Pettik
  2020-01-27 19:34 ` Konstantin Osipov
  2020-04-25  9:00 ` Konstantin Osipov
  0 siblings, 2 replies; 6+ messages in thread
From: Nikita Pettik @ 2020-01-26 20:54 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

Part of #3243
---
rfc in human-readable format: https://github.com/tarantool/tarantool/blob/np/gh-3243-multi-directional-iter-rfc/doc/rfc/3309-multi-directional-iterators.md
Issue: https://github.com/tarantool/tarantool/issues/3243

 doc/rfc/3309-multi-directional-iterators.md | 186 ++++++++++++++++++++++++++++
 1 file changed, 186 insertions(+)
 create mode 100644 doc/rfc/3309-multi-directional-iterators.md

diff --git a/doc/rfc/3309-multi-directional-iterators.md b/doc/rfc/3309-multi-directional-iterators.md
new file mode 100644
index 000000000..4f8ca7c99
--- /dev/null
+++ b/doc/rfc/3309-multi-directional-iterators.md
@@ -0,0 +1,186 @@
+# Multi-directional iterators
+
+* **Status**: In progress
+* **Start date**: 22-01-2020
+* **Authors**: Nikita Pettik @korablev77 korablev@tarantool.org
+* **Issues**: [#3243](https://github.com/tarantool/tarantool/issues/3243)
+
+
+## Background and motivation
+
+This RFC touches only Memtx engine and TREE index type (as the only available
+in SQL and most common in user's practice). Multi-directional iterator is
+an iterator which allows iterating through different key parts orders.
+For instance: consider index `i` consisting of three key parts: `a`, `b` and `c`.
+Creating and using casual iterator looks like:
+```
+i = box.space.T.index.i
+i:select({1, 2, 3}, {iterator = 'EQ'}) -- returns all tuples which has
+                                       -- fields a == 1, b == 2, c == 3.
+```
+It is OK to omit one or more key parts declaring key value to be searched. In
+this case they are assumed to be nils:
+`i:select({1}, {iterator = 'EQ'})` is the same as
+`i:select({1, nil, nil}, {iterator = 'EQ'})`. So all tuples which has `a == 1`
+are getting to the result set. More formally matching rule is following:
+```
+if (search-key-part[i] is nil)
+{
+  if (iterator is LT or GT) return FALSE
+  return TRUE
+}
+```
+
+Another example:
+`i:select({1, 1, 1}, {iterator = 'GE'})`
+
+Here all tuples with `a >= 1`, `b >= 1` and `c >= 1` are returned. But some users
+may want to select tuples with `a >= 1`, `b >= 1` but `c < 1`. Or, alternatively,
+somebody may be willing to get tuples ordered by `a` and `b` in ascending order
+but by `c` in descending order: `i:select({}, {iterator = {'GE', 'GE', 'LE'})`.
+It is analogue of common SQL query `SELECT * FROM t ORDER BY a ASC, b ASC, c DESC`.
+These requests are obviously impossible to fulfill with current indexes and
+iterators implementations. This RFC suggests ways to resolve mentioned problem
+in particular for memtx TREE indexes.
+
+## Implementation details
+
+TREE indexes in memtx engine are implemented as BPS-trees (see
+`src/lib/salad/bps_tree.h` for details). Keys correspond to particular values
+of key parts; data - to pointers to tuples. Hence, all data
+are sorted by their key values due to tree structure. For this reason HASH
+indexes have only GT and EQ (and ergo GE) iterators - data stored in a hash is
+unordered. Tree interface itself provides several functions to operate on data. 
+Iteration process starts in `tree_iterator_start()` (which is called once as
+`iterator->next()`): depending on its type iterator is positioned to the lower
+or upper bound (via `memtx_tree_lower_bound()`) of range of values satisfying
+search condition. In case key is not specified (i.e. empty), iterator is simply
+set to the first or last element of tree. At this moment first element to be
+returned (if any) is ready. To continue iterating `next` method of iterator
+object is changed to one of `tree_iterator_next()`, `tree_iterator_prev()` or
+their analogues for GE and LE iterators. Actually these functions fetch next
+element from B-tree leaf block. If iterator points to the last element in the
+block, it is set to the first element of the next block (leaf blocks are linked
+into list); if there's no more blocks, iterator is invalidated and iteration
+process is finished.  
+Taking into account this information let's review several approaches how to
+implement multi-directional iterators.
+
+### Solution №1
+
+First solution doesn't involve any additional data structures so that it deals
+with multi-directional iterators only using existing B-tree structure.  
+It fact, first key part can be used to locate first element as a candidate
+in the range to be selected. To illustrate this point let's consider following
+example:
+
+```
+s:create_index('i', {parts = {{1, 'integer'}, {2, 'integer'}}})`
+s:insert({1, 0})
+s:insert({1, 0})
+s:insert({1, 1})
+s:insert({2, 0})
+s:insert({2, 1})
+i:select({}, {iterator = {'GE', 'LE'}})
+```
+
+Result should be:
+```
+[1, 1]
+[1, 0]
+[1, 0]
+[2, 1]
+[2, 0]
+```
+Note that in case of casual GE iterator (i.e. {GE, GE} in terms of
+multi-directional iterators) result is:
+```
+[1, 0]
+[1, 0]
+[1, 1]
+[2, 0]
+[2, 1]
+```
+As one can see, results are sorted in different orders by second key part,
+but in the same order by first key part (not surprisingly). Assume first
+element with first key part satisfying search condition is located: {1, 0}.
+Then let's find out the first key part with different iterating order (in our
+case it is second key part). Since order is different for that key part, it is
+required to locate the first tuple with next first key part value: {2, 0}.
+After that, auxiliary iterator is created and positioned to that tuple (see
+schema below). Since order for the second key part is different, auxiliary
+iterator moves towards main iterator.
+
+```
+[1, 0], [1, 0], [1, 1], [2, 0] ... // tuples are arranged as in B-tree
+^                      ^
+|               <----- |
+Main iterator         Aux. iterator
+```
+Note that auxiliary iterator is assumed to process all keys between its initial
+position and main iterator position (since those keys are ordered using other
+direction - sort of full scan). Auxiliary iterator is required for each key
+part starting from that which direction is different from one of first key part.
+So that iteration process is likely to be slow without any optimizations.
+For EQ iterator type it is possible to simply skip those tuples which doesn't
+satisfy equality condition. In turn, it results in necessity to extract part of
+key value for all 'EQ' iterator key parts and compare it with reference key
+value. This algorithm can be generalized for any number of key parts in index.  
+
+Pros (+):
+ - it allows to specify any sets of key part iteration orders;
+ - in contrast to the second implementation, the first resulting tuple is
+   returned way much faster (since there's no time overhead to built new tree);
+ - finally, there's almost no memory overhead.  
+
+Cons (-):
+ - obviously the main drawback of this approach is time complexity -
+   it doesn't seem to be way faster than simple full-scan (the more key parts
+   with different iterating order are declared, the slower process will be).
+
+### Solution №2
+
+Since BPS tree is built without acknowledge in which order keys should be
+placed, it is assumed that order is always ascending: keys in blocks are sorted
+from smaller to bigger (left-to-right); comparison between keys is made by
+tuple_compare_with_key() function. It makes given tree be unsuitable for
+efficient iteration in different orders. On the other hand, it is possible to
+build new *temporary in-memory* BPS-tree featuring correct key order. It seems
+to be easy to achieve since keys order depends on result of comparator function.
+Reverting result of comparison for key parts corresponding to opposite iteration
+direction gives appropriate keys ordering in the tree. Note that not all data in
+space is needed to be in tree (in case of non-empty search key); only sub-tree
+making up lower or upper bound of first key part is required.
+
+Pros (+):
+ - any sets of key part iteration orders are allowed.  
+
+Cons (-):
+ - first tuple to selected is probably returned with significant delay;
+ - tree memory construction overhead (only during iteration routine).
+
+### Solution №3
+
+It extends solution №2 in sense it allows specifying sorting direction for
+each part right in key def, that is during index creation. For instance:
+
+`s:create_index('i', {parts = {{1, 'integer', 'asc'}, {2, 'integer', 'desc'}}})`
+
+After that 'GT'/'LT' iterators for parts with declared 'desc' sorting order will
+return reversed results of comparison, so only comparators are affected.
+That's it (probably the simplest solution; what is more 'DESC' index is casual
+SQL feature in other DBs).  
+
+Pros (+):
+ - index search via 'desc' iterators is almost as fast as via casual
+   iterators;
+ - this approach seems to be easy in implementation and resolves
+   problem in SQL (since at the moment of ephemeral space creation it is allowed
+   to set its PK key parts orders).  
+
+Cons (-):
+ - 'desc' indexes are not versatile - user is unable to set different
+    orders in iterator;
+ - order of iteration itself is immutable. As a result, for each different
+   iteration order user has to create separate index which in turn consumes
+   additional memory and time as any other index.
-- 
2.15.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-25  9:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-24 10:10 [Tarantool-patches] [PATCH] rfc: multi-directional iterators Aleksandr Lyapunov
2020-04-25  9:03 ` Konstantin Osipov
  -- strict thread matches above, loose matches on Subject: below --
2020-01-26 20:54 Nikita Pettik
2020-01-27 19:34 ` Konstantin Osipov
2020-01-27 19:38   ` Konstantin Osipov
2020-04-25  9:00 ` Konstantin Osipov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox