Tarantool development patches archive
 help / color / mirror / Atom feed
From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
To: tarantool-patches@freelists.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH v1 4/5] box: introduce path_hash and tuple_field tree
Date: Wed, 15 Aug 2018 15:14:45 +0300	[thread overview]
Message-ID: <16d44dde-cbc5-ee5e-95b3-1b77d778df51@tarantool.org> (raw)
In-Reply-To: <37cd15a8-8060-8e2b-0de1-b478df10c116@tarantool.org>

> 1. A commit of a patchset should not fix errors introduced in
> the same patchset. Please, merge this hunk into the previous commit.
No such problem already.

> 2. For flags and flag calculators please use 'is/are'.
Ok, fixed.

> 3. The 'for' consists of two lines, so should have {}.
Ok, fixed.

> 4. I found a problem with your keys extractor and tried to
> reproduce it with a simple test but got another crash even
> earlier than I expected:
Thanks, fixed as a part of global reworking.

> But this is not what I had wanted to test. The initial
> problem I found is about how you extract keys: on extract key
> slowpath functions (both raw and not) you iterate over
> the very first level of a tuple and treat its entire
> fields as key parts, but it is wrong now. In tuple key
> extractor you should dig into the field when it is a
> JSON path part. And this is where you again need the
> template parameter 'is_flat' to omit such check for flat
> indexes.
> 
> The original test I wanted to run was this:
> 
> 	box.cfg{}
> 	s = box.schema.create_space('test')
> 	parts = {}
> 	parts[1] = {'[1][1]', 'unsigned'}
> 	pk = s:create_index('pk', {parts = parts})
> 	s:insert{{1, 1}}
> 	s:upsert({{1}}, {{'+', 2, 1}})
> 
> Here, I think, upsert will try to extract key 1, but
> will actually extract entire {1} and crash in some place.
No such problem already. I've included such tests in my patchset.

> 5. Exactly the same hash already exists in assoc.h as mh_strnptr, as
> I said you verbally.
Ok, I use it now.

> 6. You do not need to announce these functions. Just move the
> implementation here.
I need them defined in header for now. 

> 7. Use mh_strnptr_find_inp.
Ok, everywhere when possible.

> 8. Realloc works ok with NULL in the first argument, so you
> can merge this if-else branching. If part->num > size then realloc
> + memset of the gap. No special calloc.
Ok, done.

> 9. Please, preallocate the needed hash elements count once, using
> mh reserve() function.
Ok, I do this for format hash now.

> 10. Please, allocate json paths memory in the same block as the format.
> See tuple_format_alloc. This allows to do not do strdups here, do not
> use a 'free_path' flag in json_path_hash_delete, and reduces memory
> fragmentation.
Ok, done.

> 11. When a field is array, you can iterate over its fields and check
> if corresponding field->array[i] is not null. This allows to iterate
> over the array only once, while now you rescan it O(N) times where
> N is size of the field->array.
Good idea, I've done it.
> 
> 12. For the map above you can do another thing. Each time you do
> go_to_key() you can remember the key index (number of iterations
> inside go_to_key()) and the result pointer. During the cycle you
> calculate the maximal key index and its pointer. And then you
> finish the field decoding starting from this point. This is a
> strong optimization. For example, now for such field
> 
> {a = {b = {c = d, e = f}, g = h}}
> 
> and key in a.b.c you decode it as
> 
> {a = {b = {c = d, e = f}, g = h}}
> {b = {c = d, e = f}
> {c = d, e = f}
> 
> This is because after you found the key you still need to
> skip the entire map field to find the next one (see the next
> comment).
> 
> With my method you decode it as
> 
> {a = {b = {c = d, e = f}, g = h}}
If I've correctly understood this feature is already implemented in my patch.

> 13. Now you rescan the field pointed by pos twice - first
> time inside tuple_init_json_field_map, second in the mp_next below.
> You should not.
External mp_next moves across fields; at the same time  tuple_init_json_field_map (called 
json_field_tree_exec_routine now) iterates iside of it.
I don't know, how could I use internal offset to go to next field.
[{}, {}]
Believe that this would not the only thing that I'll fix next time.

> 14. As I said you verbally, instead of recalculating hash from
> the path you can store it in the key_part together with slot cache
> and epoch. And use light version of json_path_hash_get which takes
> a path with already calculated hash.
Ok, done.

>> +struct tuple_field_map;
> 15. What is it?
crap. dropped.
> 16. What is it? I do not see where is it
> assigned or used.
dropped.

>> -	 * Offset slot in field map in tuple. Normally tuple
>> +	 * Offset slot ix`n field map in tuple. Normally tuple
> 
> 17. !?
Fixed.
> 18. Sorry, but it is too little number of tests. You even did not
> test primary JSON index. As well as key extractions (upsert, vinyl replace/update etc),
> tuple hashes (via dumps in vinyl for example) etc.
I've wrote many pretty tests.

  reply	other threads:[~2018-08-15 12:14 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-06 12:26 [tarantool-patches] [PATCH v1 0/5] box: indexes by JSON path Kirill Shcherbatov
2018-08-06 12:26 ` [tarantool-patches] [PATCH v1 1/5] rfc: describe a Tarantool JSON indexes Kirill Shcherbatov
2018-08-06 12:26 ` [tarantool-patches] [PATCH v1 2/5] box: introduce slot_cache in key_part Kirill Shcherbatov
2018-08-08 22:03   ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-15 12:14     ` Kirill Shcherbatov
2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 3/5] box: introduce path field " Kirill Shcherbatov
2018-08-08 22:03   ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-15 12:14     ` Kirill Shcherbatov
2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 4/5] box: introduce path_hash and tuple_field tree Kirill Shcherbatov
2018-08-08 22:03   ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-15 12:14     ` Kirill Shcherbatov [this message]
2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
2018-08-08 22:03 ` [tarantool-patches] Re: [PATCH v1 0/5] box: indexes by JSON path Vladislav Shpilevoy
2018-08-15 12:14   ` Kirill Shcherbatov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=16d44dde-cbc5-ee5e-95b3-1b77d778df51@tarantool.org \
    --to=kshcherbatov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH v1 4/5] box: introduce path_hash and tuple_field tree' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox