From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Subject: Re: [tarantool-patches] Re: [PATCH v1 1/1] rfc: describe a Tarantool JSON indexes
References: <7192ba6c28bf9cd637f7e1e5263bbf9771cc6f44.1532603654.git.kshcherbatov@tarantool.org>
 <20180727151013.goyfa4uuf7nl7nou@esperanza>
 <3a467b72-cff9-cc19-7dec-358b5a020e62@tarantool.org>
From: Vladislav Shpilevoy
Message-ID: <275eccb5-b77d-a2ed-7ee2-c002a28cd096@tarantool.org>
Date: Mon, 30 Jul 2018 21:46:28 +0300
MIME-Version: 1.0
In-Reply-To: <3a467b72-cff9-cc19-7dec-358b5a020e62@tarantool.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
To: Kirill Shcherbatov, tarantool-patches@freelists.org, Vladimir Davydov
List-ID:

On 30/07/2018 19:14, Kirill Shcherbatov wrote:
> Hi! Thank you for review. I've accounted all your suggestions.
>
> On 30.07.2018 16:45, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the new version! See 12 comments below.
>>
>> Vova, please, look at my comments and say what do you think.
> We have verbally discuss all of this with Vova.

It is cool, but I do not know what you have discussed. I guess
such things should be done either via email, or verbally with
all participants.

>
>> 2. As I remember from verbal discussion, we've decided to do not store
>> offsets for intermediate nodes. It is too expensive. You actually purpose
>> to store an offset for each tuple field, even non-indexed. In such a case
>> the field_map would become bigger than the tuple payload. Field_map is
>> very expensive storage and should not store non-needed offsets. So you should
>> not have an offset on [name], on [birthday]. Only on [first] and [last].
> I've already answered with previous letter that this is not slot_offset that allocated as a part of tuple.
> "off. cache" is only implementation-specific detail that allows start parsing with most relevant offset
> on tree traversal.

I still do not clearly understand what you are talking about.
You can have different offsets in the same path: 1) [1][2][3].field and
2) [1][2][3]["field"]. Here 'field' has offset 10 in the first case and
11 in the second one. If you want to use the prefix length in off.cache
on comparison, to walk the tuple_field trees along with the path in
key_part on a mismatch of cache versions, then you should explain more
clearly how you want to use off.cache.

Let's suppose you have a key_part with a JSON path and a trees array.
To determine which tree you go into first to find offset_slot, you
should parse the first path part. The same holds for each next part: you
should parse it to go down the tree. So you just do not know which
tuple_field you go into until the next part of the path is parsed. And
how does tuple_field.off_cache help here?

And what will you do when you meet a format which does not have an
offset for the needed field? For example, I have created an index,
inserted multiple tuples, then created another index. The format is
changed, but the old tuples have the old format, which does not have
offsets to the parts of the new index.

>> 8. This is the array of trees. It is not array + tree in a separate
>> field. You have array of trees where i-th tree describes format of
>> the i-th field and its internals. Some of tree-nodes have offsets
>> and some are just to validate the format. Do not forget that these
>> trees are going to be used for space:format validation. Offset_slot
>> is a part of tuple_field, even now, and is filled optionally if the
>> field is a part of an index.
> struct tuple_format {
>      ...
>
>      /** Epoch of tuple format. */
>      uint32_t epoch;
>      /** Array of data_path trees built for indexes. */
>      TREE index_tree[0];

As I said, it is not a special index tree. It can describe a space
format with tens of fields among which only one is indexed. Index_tree
is not a correct name. It is a tuple_field array, as it is now, where
each field is just a struct tuple_field.
And struct tuple_field can contain more tuple_fields inside, either as
an array (if the field type is array) or inside a tree/hash if it is
a map.

> };
> ```
>
> Hm, perhaps it is the time to include you and Vova to RFC authors?

I don't know whether it matters. Up to you %) If you want to be
a single author, you are welcome.
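
To make the layout described above more concrete (an array of fields,
where each field may contain nested fields and only indexed ones carry
an offset slot), here is a rough sketch. It is only an illustration
under my own assumptions, not the real struct tuple_field: the names
field_node, path_part, NO_OFFSET_SLOT and format_lookup are all
hypothetical, the path is assumed to be already parsed into parts, and
map children are kept in a plain array instead of a tree/hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: the field has no slot in the field_map. */
enum { NO_OFFSET_SLOT = INT32_MAX };

enum node_kind { NODE_SCALAR, NODE_ARRAY, NODE_MAP };

/* Hypothetical stand-in for struct tuple_field. */
struct field_node {
	enum node_kind kind;
	/* Key of this node inside a parent map; NULL for array items. */
	const char *key;
	/* Slot in the field_map, set only if the field is indexed. */
	int32_t offset_slot;
	/* Nested fields: by position for arrays, by key for maps. */
	struct field_node *children;
	uint32_t child_count;
};

/* One already parsed path part: an array index or a map key. */
struct path_part {
	const char *key; /* non-NULL for a map key */
	uint32_t index;  /* 0-based, used when key == NULL */
};

/* Go one level down; the part must be known before the step. */
static const struct field_node *
field_node_child(const struct field_node *node, const struct path_part *part)
{
	if (part->key == NULL) {
		if (node->kind != NODE_ARRAY ||
		    part->index >= node->child_count)
			return NULL;
		return &node->children[part->index];
	}
	if (node->kind != NODE_MAP)
		return NULL;
	for (uint32_t i = 0; i < node->child_count; i++) {
		const char *key = node->children[i].key;
		if (key != NULL && strcmp(key, part->key) == 0)
			return &node->children[i];
	}
	return NULL;
}

/*
 * Walk the top-level field array and then the nested fields.
 * Returns NULL if the format does not describe the path.
 */
const struct field_node *
format_lookup(const struct field_node *fields, uint32_t field_count,
	      const struct path_part *parts, uint32_t part_count)
{
	assert(fields != NULL || field_count == 0);
	if (part_count == 0 || parts[0].key != NULL ||
	    parts[0].index >= field_count)
		return NULL;
	const struct field_node *node = &fields[parts[0].index];
	for (uint32_t i = 1; i < part_count && node != NULL; i++)
		node = field_node_child(node, &parts[i]);
	return node;
}
```

With a format where the second top-level field is a map holding
"name" -> {"first", "last"} and "birthday", only "first" and "last"
would get an offset_slot; "name" and "birthday" would exist in the
tree for validation and keep NO_OFFSET_SLOT. A lookup that returns
NULL corresponds to the case of an old format that does not describe
the path of a newly created index.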