From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 59A9528DCC for ; Wed, 15 Aug 2018 08:14:47 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id KNk0cTFrDVqx for ; Wed, 15 Aug 2018 08:14:47 -0400 (EDT) Received: from smtp56.i.mail.ru (smtp56.i.mail.ru [217.69.128.36]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 185E728BF0 for ; Wed, 15 Aug 2018 08:14:46 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH v1 4/5] box: introduce path_hash and tuple_field tree References: <34eae62c9453063f7ff90ce6fe6a7feba0440ca9.1533558332.git.kshcherbatov@tarantool.org> <37cd15a8-8060-8e2b-0de1-b478df10c116@tarantool.org> From: Kirill Shcherbatov Message-ID: <16d44dde-cbc5-ee5e-95b3-1b77d778df51@tarantool.org> Date: Wed, 15 Aug 2018 15:14:45 +0300 MIME-Version: 1.0 In-Reply-To: <37cd15a8-8060-8e2b-0de1-b478df10c116@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org, Vladislav Shpilevoy > 1. A commit of a patchset should not fix errors introduced in > the same patchset. Please, merge this hunk into the previous commit. No such problem already. > 2. For flags and flag calculators please use 'is/are'. Ok, fixed. > 3. The 'for' consists of two lines, so should have {}. Ok, fixed. > 4. I found a problem with your keys extractor and tried to > reproduce it with a simple test but got another crash even > earlier than I expected: Thanks, fixed as a part of global reworking. > But this is not what I had wanted to test. The initial > problem I found is about how you extract keys: on extract key > slowpath functions (both raw and not) you iterate over > the very first level of a tuple and treat its entire > fields as key parts, but it is wrong now. In tuple key > extractor you should dig into the field when it is a > JSON path part. And this is where you again need the > template parameter 'is_flat' to omit such check for flat > indexes. > > The original test I wanted to run was this: > > box.cfg{} > s = box.schema.create_space('test') > parts = {} > parts[1] = {'[1][1]', 'unsigned'} > pk = s:create_index('pk', {parts = parts}) > s:insert{{1, 1}} > s:upsert({{1}}, {{'+', 2, 1}}) > > Here, I think, upsert will try to extract key 1, but > will actually extract entire {1} and crash in some place. No such problem already. I've included such tests in my patchset. > 5. Exactly the same hash already exists in assoc.h as mh_strnptr, as > I said you verbally. Ok, I use it now. > 6. You do not need to announce these functions. Just move the > implementation here. I need them defined in header for now. > 7. Use mh_strnptr_find_inp. Ok, everywhere when possible. > 8. Realloc works ok with NULL in the first argument, so you > can merge this if-else branching. If part->num > size then realloc > + memset of the gap. No special calloc. Ok, done. > 9. Please, preallocate the needed hash elements count once, using > mh reserve() function. Ok, I do this for format hash now. > 10. Please, allocate json paths memory in the same block as the format. > See tuple_format_alloc. This allows to do not do strdups here, do not > use a 'free_path' flag in json_path_hash_delete, and reduces memory > fragmentation. Ok, done. > 11. When a field is array, you can iterate over its fields and check > if corresponding field->array[i] is not null. This allows to iterate > over the array only once, while now you rescan it O(N) times where > N is size of the field->array. Good idea, I've done it. > > 12. For the map above you can do another thing. Each time you do > go_to_key() you can remember the key index (number of iterations > inside go_to_key()) and the result pointer. During the cycle you > calculate the maximal key index and its pointer. And then you > finish the field decoding starting from this point. This is a > strong optimization. For example, now for such field > > {a = {b = {c = d, e = f}, g = h}} > > and key in a.b.c you decode it as > > {a = {b = {c = d, e = f}, g = h}} > {b = {c = d, e = f} > {c = d, e = f} > > This is because after you found the key you still need to > skip the entire map field to find the next one (see the next > comment). > > With my method you decode it as > > {a = {b = {c = d, e = f}, g = h}} If I've correctly understood this feature is already implemented in my patch. > 13. Now you rescan the field pointed by pos twice - first > time inside tuple_init_json_field_map, second in the mp_next below. > You should not. External mp_next moves across fields; at the same time tuple_init_json_field_map (called json_field_tree_exec_routine now) iterates iside of it. I don't know, how could I use internal offset to go to next field. [{}, {}] Believe that this would not the only thing that I'll fix next time. > 14. As I said you verbally, instead of recalculating hash from > the path you can store it in the key_part together with slot cache > and epoch. And use light version of json_path_hash_get which takes > a path with already calculated hash. Ok, done. >> +struct tuple_field_map; > 15. What is it? crap. dropped. > 16. What is it? I do not see where is it > assigned or used. dropped. >> - * Offset slot in field map in tuple. Normally tuple >> + * Offset slot ix`n field map in tuple. Normally tuple > > 17. !? Fixed. > 18. Sorry, but it is too little number of tests. You even did not > test primary JSON index. As well as key extractions (upsert, vinyl replace/update etc), > tuple hashes (via dumps in vinyl for example) etc. I've wrote many pretty tests.