[tarantool-patches] Re: [PATCH v1 3/5] box: introduce path field in key_part

Kirill Shcherbatov kshcherbatov at tarantool.org
Wed Aug 15 15:14:45 MSK 2018


> 1. Please, keep monolithic key_def memory layout.
Yep, good idea. New patch version attached.

> 2. As I said you at least two times, strncmp treats non-equal strings
> as the same one when the one is prefix of another. So this comparison
> is not correct.
-               if ((rc = strncmp(part1->path, part2->path, len)) != 0)
+               if ((rc = memcmp(part1->path, part2->path, len)) != 0)


> 3. Just do ++part on the previous line.
-       struct key_part_def *part;
-       for (uint32_t i = 0; i < part_count; i++) {
-               part = &parts[i];
+       struct key_part_def *part = parts;
+       for (uint32_t i = 0; i < part_count; i++, part++) {


> 4. As I understand, len of ".a" is 2, len of "['a']" is 5. So
> it is actually 2*len + 1. Not 3*len + 1. It is not?
No, if I not mistaken, 2len+1 is not enough at worst scenario:
let |str| - length(str); new string is str' -> |str'| - length of the new string
let n_lex - count of lexemes in string (like .a that is the minimal construction)
let w_{i}^{ex} - whole length of original lexeme i, w_{i} - length of the string part without special characters.

|str| = \sum_{n=1}^{n_lex} w_{i}^{ex} = \sum_{n=1}^{n_lex}(1 + w_{i}) = 		n_lex + \sum_{1}^{n_lex}w_{i} = n_lex + S
|str'| = \sum_{n=1}^{n_lex}(4 + w_{i}) =
		4n_lex + \sum_{1}^{n_lex}w_{i} = 4n_lex + S
|str'| = 4n_lex + |str| - n_lex = 3n_lex + |str|

At the worst scenario, n_lex = \frac{|str|}{2}
|str'| = 3*\frac{|str|}{2} + |str| = \frac{5}{2}*|str| = 2|str| + \frac{1}{2}*|str|
That means that for |str| > 2 : 2|str| + 1 <   2|str| + \frac{1}{2}*|str|

We can use 3|str| instead of 3|str| + 1, but believe that 1 byte is nothing, but here
it looks like terminating-zero slot.

e.g.
|.a.a.a.a| = 4*2 = n*2 = 8
|["a"]["a"]["a"]["a"]| = 4*5 = n*5 = 20
2*|str| + 1 = 2*8 + 1 = 17 i.e. segfault

> 5. You do not need to truncate the region yourself. This is done
> later by commit/rollback. But you can reuse previous the region
> allocations if store the region memory pointer out of this cycle
> together size. And when a new path len * 2 + 1 > than the memory
> size you have, you allocate more. Else you can reuse it. In the
> best case it reduces number of allocations in part_count times,
> in the worst one it is not worse than your way.
Good idea,
I started to reuse the biggest of two items: the rest of 3|str| + 1 new string or old |str| string

> 6. As array index. Not array.
fixed.
-                                       "be defined as array");
+                                     "be defined as array index");


> 7. Please, reduce indentation level. You can check
> 'if part->path == NULL then continue' and move this
> code block 1 tab left. Even now you can join some of
> lines into one.
This looks ungly.
I've moved all this code to a new function:
static int
key_def_normalize_json_path(struct region *region, struct key_part_def *part,
			    char **path_extra, uint32_t *path_extra_size);


> 8. Please, write on which position the error occurred.
ok, done.




More information about the Tarantool-patches mailing list