From: Kirill Shcherbatov <kshcherbatov@tarantool.org> To: tarantool-patches@freelists.org, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Subject: [tarantool-patches] Re: [PATCH v1 4/5] box: introduce path_hash and tuple_field tree Date: Wed, 15 Aug 2018 15:14:45 +0300 [thread overview] Message-ID: <16d44dde-cbc5-ee5e-95b3-1b77d778df51@tarantool.org> (raw) In-Reply-To: <37cd15a8-8060-8e2b-0de1-b478df10c116@tarantool.org> > 1. A commit of a patchset should not fix errors introduced in > the same patchset. Please, merge this hunk into the previous commit. No such problem already. > 2. For flags and flag calculators please use 'is/are'. Ok, fixed. > 3. The 'for' consists of two lines, so should have {}. Ok, fixed. > 4. I found a problem with your keys extractor and tried to > reproduce it with a simple test but got another crash even > earlier than I expected: Thanks, fixed as a part of global reworking. > But this is not what I had wanted to test. The initial > problem I found is about how you extract keys: on extract key > slowpath functions (both raw and not) you iterate over > the very first level of a tuple and treat its entire > fields as key parts, but it is wrong now. In tuple key > extractor you should dig into the field when it is a > JSON path part. And this is where you again need the > template parameter 'is_flat' to omit such check for flat > indexes. > > The original test I wanted to run was this: > > box.cfg{} > s = box.schema.create_space('test') > parts = {} > parts[1] = {'[1][1]', 'unsigned'} > pk = s:create_index('pk', {parts = parts}) > s:insert{{1, 1}} > s:upsert({{1}}, {{'+', 2, 1}}) > > Here, I think, upsert will try to extract key 1, but > will actually extract entire {1} and crash in some place. No such problem already. I've included such tests in my patchset. > 5. Exactly the same hash already exists in assoc.h as mh_strnptr, as > I said you verbally. Ok, I use it now. > 6. You do not need to announce these functions. Just move the > implementation here. I need them defined in header for now. > 7. Use mh_strnptr_find_inp. Ok, everywhere when possible. > 8. Realloc works ok with NULL in the first argument, so you > can merge this if-else branching. If part->num > size then realloc > + memset of the gap. No special calloc. Ok, done. > 9. Please, preallocate the needed hash elements count once, using > mh reserve() function. Ok, I do this for format hash now. > 10. Please, allocate json paths memory in the same block as the format. > See tuple_format_alloc. This allows to do not do strdups here, do not > use a 'free_path' flag in json_path_hash_delete, and reduces memory > fragmentation. Ok, done. > 11. When a field is array, you can iterate over its fields and check > if corresponding field->array[i] is not null. This allows to iterate > over the array only once, while now you rescan it O(N) times where > N is size of the field->array. Good idea, I've done it. > > 12. For the map above you can do another thing. Each time you do > go_to_key() you can remember the key index (number of iterations > inside go_to_key()) and the result pointer. During the cycle you > calculate the maximal key index and its pointer. And then you > finish the field decoding starting from this point. This is a > strong optimization. For example, now for such field > > {a = {b = {c = d, e = f}, g = h}} > > and key in a.b.c you decode it as > > {a = {b = {c = d, e = f}, g = h}} > {b = {c = d, e = f} > {c = d, e = f} > > This is because after you found the key you still need to > skip the entire map field to find the next one (see the next > comment). > > With my method you decode it as > > {a = {b = {c = d, e = f}, g = h}} If I've correctly understood this feature is already implemented in my patch. > 13. Now you rescan the field pointed by pos twice - first > time inside tuple_init_json_field_map, second in the mp_next below. > You should not. External mp_next moves across fields; at the same time tuple_init_json_field_map (called json_field_tree_exec_routine now) iterates iside of it. I don't know, how could I use internal offset to go to next field. [{}, {}] Believe that this would not the only thing that I'll fix next time. > 14. As I said you verbally, instead of recalculating hash from > the path you can store it in the key_part together with slot cache > and epoch. And use light version of json_path_hash_get which takes > a path with already calculated hash. Ok, done. >> +struct tuple_field_map; > 15. What is it? crap. dropped. > 16. What is it? I do not see where is it > assigned or used. dropped. >> - * Offset slot in field map in tuple. Normally tuple >> + * Offset slot ix`n field map in tuple. Normally tuple > > 17. !? Fixed. > 18. Sorry, but it is too little number of tests. You even did not > test primary JSON index. As well as key extractions (upsert, vinyl replace/update etc), > tuple hashes (via dumps in vinyl for example) etc. I've wrote many pretty tests.
next prev parent reply other threads:[~2018-08-15 12:14 UTC|newest] Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-08-06 12:26 [tarantool-patches] [PATCH v1 0/5] box: indexes by JSON path Kirill Shcherbatov 2018-08-06 12:26 ` [tarantool-patches] [PATCH v1 1/5] rfc: describe a Tarantool JSON indexes Kirill Shcherbatov 2018-08-06 12:26 ` [tarantool-patches] [PATCH v1 2/5] box: introduce slot_cache in key_part Kirill Shcherbatov 2018-08-08 22:03 ` [tarantool-patches] " Vladislav Shpilevoy 2018-08-15 12:14 ` Kirill Shcherbatov 2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 3/5] box: introduce path field " Kirill Shcherbatov 2018-08-08 22:03 ` [tarantool-patches] " Vladislav Shpilevoy 2018-08-15 12:14 ` Kirill Shcherbatov 2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 4/5] box: introduce path_hash and tuple_field tree Kirill Shcherbatov 2018-08-08 22:03 ` [tarantool-patches] " Vladislav Shpilevoy 2018-08-15 12:14 ` Kirill Shcherbatov [this message] 2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov 2018-08-08 22:03 ` [tarantool-patches] Re: [PATCH v1 0/5] box: indexes by JSON path Vladislav Shpilevoy 2018-08-15 12:14 ` Kirill Shcherbatov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=16d44dde-cbc5-ee5e-95b3-1b77d778df51@tarantool.org \ --to=kshcherbatov@tarantool.org \ --cc=tarantool-patches@freelists.org \ --cc=v.shpilevoy@tarantool.org \ --subject='[tarantool-patches] Re: [PATCH v1 4/5] box: introduce path_hash and tuple_field tree' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox