Tarantool development patches archive
 help / color / mirror / Atom feed
From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
To: tarantool-patches@freelists.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH v1 3/5] box: introduce path field in key_part
Date: Wed, 15 Aug 2018 15:14:45 +0300	[thread overview]
Message-ID: <9f38e2d5-f540-f8d8-3619-daa346b0a70b@tarantool.org> (raw)
In-Reply-To: <295310f0-1590-0b32-47f7-3231f925e19a@tarantool.org>

> 1. Please, keep monolithic key_def memory layout.
Yep, good idea. New patch version attached.

> 2. As I said you at least two times, strncmp treats non-equal strings
> as the same one when the one is prefix of another. So this comparison
> is not correct.
-               if ((rc = strncmp(part1->path, part2->path, len)) != 0)
+               if ((rc = memcmp(part1->path, part2->path, len)) != 0)


> 3. Just do ++part on the previous line.
-       struct key_part_def *part;
-       for (uint32_t i = 0; i < part_count; i++) {
-               part = &parts[i];
+       struct key_part_def *part = parts;
+       for (uint32_t i = 0; i < part_count; i++, part++) {


> 4. As I understand, len of ".a" is 2, len of "['a']" is 5. So
> it is actually 2*len + 1. Not 3*len + 1. It is not?
No, if I not mistaken, 2len+1 is not enough at worst scenario:
let |str| - length(str); new string is str' -> |str'| - length of the new string
let n_lex - count of lexemes in string (like .a that is the minimal construction)
let w_{i}^{ex} - whole length of original lexeme i, w_{i} - length of the string part without special characters.

|str| = \sum_{n=1}^{n_lex} w_{i}^{ex} = \sum_{n=1}^{n_lex}(1 + w_{i}) = 		n_lex + \sum_{1}^{n_lex}w_{i} = n_lex + S
|str'| = \sum_{n=1}^{n_lex}(4 + w_{i}) =
		4n_lex + \sum_{1}^{n_lex}w_{i} = 4n_lex + S
|str'| = 4n_lex + |str| - n_lex = 3n_lex + |str|

At the worst scenario, n_lex = \frac{|str|}{2}
|str'| = 3*\frac{|str|}{2} + |str| = \frac{5}{2}*|str| = 2|str| + \frac{1}{2}*|str|
That means that for |str| > 2 : 2|str| + 1 <   2|str| + \frac{1}{2}*|str|

We can use 3|str| instead of 3|str| + 1, but believe that 1 byte is nothing, but here
it looks like terminating-zero slot.

e.g.
|.a.a.a.a| = 4*2 = n*2 = 8
|["a"]["a"]["a"]["a"]| = 4*5 = n*5 = 20
2*|str| + 1 = 2*8 + 1 = 17 i.e. segfault

> 5. You do not need to truncate the region yourself. This is done
> later by commit/rollback. But you can reuse previous the region
> allocations if store the region memory pointer out of this cycle
> together size. And when a new path len * 2 + 1 > than the memory
> size you have, you allocate more. Else you can reuse it. In the
> best case it reduces number of allocations in part_count times,
> in the worst one it is not worse than your way.
Good idea,
I started to reuse the biggest of two items: the rest of 3|str| + 1 new string or old |str| string

> 6. As array index. Not array.
fixed.
-                                       "be defined as array");
+                                     "be defined as array index");


> 7. Please, reduce indentation level. You can check
> 'if part->path == NULL then continue' and move this
> code block 1 tab left. Even now you can join some of
> lines into one.
This looks ungly.
I've moved all this code to a new function:
static int
key_def_normalize_json_path(struct region *region, struct key_part_def *part,
			    char **path_extra, uint32_t *path_extra_size);


> 8. Please, write on which position the error occurred.
ok, done.

  reply	other threads:[~2018-08-15 12:14 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-06 12:26 [tarantool-patches] [PATCH v1 0/5] box: indexes by JSON path Kirill Shcherbatov
2018-08-06 12:26 ` [tarantool-patches] [PATCH v1 1/5] rfc: describe a Tarantool JSON indexes Kirill Shcherbatov
2018-08-06 12:26 ` [tarantool-patches] [PATCH v1 2/5] box: introduce slot_cache in key_part Kirill Shcherbatov
2018-08-08 22:03   ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-15 12:14     ` Kirill Shcherbatov
2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 3/5] box: introduce path field " Kirill Shcherbatov
2018-08-08 22:03   ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-15 12:14     ` Kirill Shcherbatov [this message]
2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 4/5] box: introduce path_hash and tuple_field tree Kirill Shcherbatov
2018-08-08 22:03   ` [tarantool-patches] " Vladislav Shpilevoy
2018-08-15 12:14     ` Kirill Shcherbatov
2018-08-06 12:27 ` [tarantool-patches] [PATCH v1 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
2018-08-08 22:03 ` [tarantool-patches] Re: [PATCH v1 0/5] box: indexes by JSON path Vladislav Shpilevoy
2018-08-15 12:14   ` Kirill Shcherbatov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9f38e2d5-f540-f8d8-3619-daa346b0a70b@tarantool.org \
    --to=kshcherbatov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH v1 3/5] box: introduce path field in key_part' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox