From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id DEEA528DFC for ; Wed, 15 Aug 2018 08:14:47 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8aNnx6e5V0gh for ; Wed, 15 Aug 2018 08:14:47 -0400 (EDT) Received: from smtp16.mail.ru (smtp16.mail.ru [94.100.176.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 9BCA128DEF for ; Wed, 15 Aug 2018 08:14:47 -0400 (EDT) From: Kirill Shcherbatov Subject: [tarantool-patches] Re: [PATCH v1 3/5] box: introduce path field in key_part References: <63cb256a0206e222ee10199e7d671700e44ce2aa.1533558332.git.kshcherbatov@tarantool.org> <295310f0-1590-0b32-47f7-3231f925e19a@tarantool.org> Message-ID: <9f38e2d5-f540-f8d8-3619-daa346b0a70b@tarantool.org> Date: Wed, 15 Aug 2018 15:14:45 +0300 MIME-Version: 1.0 In-Reply-To: <295310f0-1590-0b32-47f7-3231f925e19a@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org, Vladislav Shpilevoy > 1. Please, keep monolithic key_def memory layout. Yep, good idea. New patch version attached. > 2. As I said you at least two times, strncmp treats non-equal strings > as the same one when the one is prefix of another. So this comparison > is not correct. - if ((rc = strncmp(part1->path, part2->path, len)) != 0) + if ((rc = memcmp(part1->path, part2->path, len)) != 0) > 3. Just do ++part on the previous line. - struct key_part_def *part; - for (uint32_t i = 0; i < part_count; i++) { - part = &parts[i]; + struct key_part_def *part = parts; + for (uint32_t i = 0; i < part_count; i++, part++) { > 4. As I understand, len of ".a" is 2, len of "['a']" is 5. So > it is actually 2*len + 1. Not 3*len + 1. It is not? No, if I not mistaken, 2len+1 is not enough at worst scenario: let |str| - length(str); new string is str' -> |str'| - length of the new string let n_lex - count of lexemes in string (like .a that is the minimal construction) let w_{i}^{ex} - whole length of original lexeme i, w_{i} - length of the string part without special characters. |str| = \sum_{n=1}^{n_lex} w_{i}^{ex} = \sum_{n=1}^{n_lex}(1 + w_{i}) = n_lex + \sum_{1}^{n_lex}w_{i} = n_lex + S |str'| = \sum_{n=1}^{n_lex}(4 + w_{i}) = 4n_lex + \sum_{1}^{n_lex}w_{i} = 4n_lex + S |str'| = 4n_lex + |str| - n_lex = 3n_lex + |str| At the worst scenario, n_lex = \frac{|str|}{2} |str'| = 3*\frac{|str|}{2} + |str| = \frac{5}{2}*|str| = 2|str| + \frac{1}{2}*|str| That means that for |str| > 2 : 2|str| + 1 < 2|str| + \frac{1}{2}*|str| We can use 3|str| instead of 3|str| + 1, but believe that 1 byte is nothing, but here it looks like terminating-zero slot. e.g. |.a.a.a.a| = 4*2 = n*2 = 8 |["a"]["a"]["a"]["a"]| = 4*5 = n*5 = 20 2*|str| + 1 = 2*8 + 1 = 17 i.e. segfault > 5. You do not need to truncate the region yourself. This is done > later by commit/rollback. But you can reuse previous the region > allocations if store the region memory pointer out of this cycle > together size. And when a new path len * 2 + 1 > than the memory > size you have, you allocate more. Else you can reuse it. In the > best case it reduces number of allocations in part_count times, > in the worst one it is not worse than your way. Good idea, I started to reuse the biggest of two items: the rest of 3|str| + 1 new string or old |str| string > 6. As array index. Not array. fixed. - "be defined as array"); + "be defined as array index"); > 7. Please, reduce indentation level. You can check > 'if part->path == NULL then continue' and move this > code block 1 tab left. Even now you can join some of > lines into one. This looks ungly. I've moved all this code to a new function: static int key_def_normalize_json_path(struct region *region, struct key_part_def *part, char **path_extra, uint32_t *path_extra_size); > 8. Please, write on which position the error occurred. ok, done.