From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-f180.google.com (mail-lj1-f180.google.com [209.85.208.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 6D1E546970E for ; Mon, 27 Jan 2020 22:34:16 +0300 (MSK) Received: by mail-lj1-f180.google.com with SMTP id o11so12148327ljc.6 for ; Mon, 27 Jan 2020 11:34:16 -0800 (PST) Date: Mon, 27 Jan 2020 22:34:14 +0300 From: Konstantin Osipov Message-ID: <20200127193414.GA19132@atlas> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH] rfc: multi-directional iterators List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikita Pettik Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org * Nikita Pettik [20/01/26 23:58]: Please choose #1 > +As one can see, results are sorted in different orders by second key part, > +but in the same order by first key part (not surprisingly). Assume first > +element with first key part satisfying search condition is located: {1, 0}. > +Then let's find out the first key part with different iterating order (in our > +case it is second key part). Since order is different for that key part, it is > +required to locate the first tuple with next first key part value: {2, 0}. > +After that, auxiliary iterator is created and positioned to that tuple (see > +schema below). Since order for the second key part is different, auxiliary > +iterator moves towards main iterator. > + > +``` > +[1, 0], [1, 0], [1, 1], [2, 0] ... // tuples are arranged as in B-tree > +^ ^ > +| <----- | > +Main iterator Aux. iterator > +``` > +Note that auxiliary iterator is assumed to process all keys between its initial > +position and main iterator position (since those keys are ordered using other > +direction - sort of full scan). Auxiliary iterator is required for each key > +part starting from that which direction is different from one of first key part. > +So that iteration process is likely to be slow without any optimizations. > +For EQ iterator type it is possible to simply skip those tuples which doesn't > +satisfy equality condition. In turn, it results in necessity to extract part of > +key value for all 'EQ' iterator key parts and compare it with reference key > +value. This algorithm can be generalized for any number of key parts in index. > + > +Pros (+): > + - it allows to specify any sets of key part iteration orders; > + - in contrast to the second implementation, the first resulting tuple is > + returned way much faster (since there's no time overhead to built new tree); > + - finally, there's almost no memory overhead. > + > +Cons (-): > + - obviously the main drawback of this approach is time complexity - > + it doesn't seem to be way faster than simple full-scan (the more key parts > + with different iterating order are declared, the slower process will be). Locating the upper bound is O(log(N)), not O(N). C++17 added map::equal_range, bps could provide a similar API http://www.cplusplus.com/reference/map/map/equal_range/ -- Konstantin Osipov, Moscow, Russia