From: Vladislav Shpilevoy
Date: Tue, 3 Apr 2018 13:20:50 +0300
Subject: [tarantool-patches] Re: [PATCH v2 3/3] Multibyte characters support
To: tarantool-patches@freelists.org, Kirill Shcherbatov
Message-ID: <08b2862f-7741-8d62-c2e7-293964e69376@tarantool.org>
References: <55E839FD-8366-4BA0-BDAF-5C13661E40F7@tarantool.org>

Hello. Please consider the 9 comments below.

1. It seems you again sent the patch by copy-pasting the diff into a mail
client. Please don't do that. This "method"
- destroys all tabs, converting them to spaces whose count != tab width,
- skips the commit message body.
For now I am looking at the letter in the commits mailing list.

2. Please write a commit message body and reference the GitHub issue using
'Linked with #NNNN' or 'Part of #NNNN'. Though, as far as I can tell, your
patch actually closes it.

3.
In lbox_tuple_field_by_path you calculate the path length when a field is
not found. Please try to reuse the length already calculated here:

> size_t path_len;
> - const char *path = lua_tolstring(L, 2, &path_len);
> + path = lua_tolstring(L, 2, &path_len);

4. I cannot build the branch:

> [ 15%] Building C object test/unit/CMakeFiles/heap.test.dir/heap.c.o
> In file included from /Users/v.shpilevoy/Work/Repositories/tarantool/src/lib/json/path.c:32:
> /Users/v.shpilevoy/Work/Repositories/tarantool/src/lib/json/path.h:37:10: fatal error: 'malloc.h' file not found
> #include <malloc.h>
>          ^~~~~~~~~~
> [ 15%] Built target api

> if (index < 0) {
> not_found:
> + if (!path)
> + goto exit_not_found;
> + uint32_t path_len = strlen(path);

5. The indentation looks broken.

6. Please try to simplify the function. It looks very complex with 3 labels
and a strange "mark".

> /**
> + * Checks is multibyte character whose first byte
> + * is pointed to by mb_str is alphabetic.
> + * NOTE: You have to clean global context
> + * with mbtowc(NULL, 0, 0); before first call

7. Where did you find this info? I cannot find it.

> +/**
> + * Counts user-string sign count in mb_str_size bytes
> + * @param mb_str
> + * @param mb_str_size
> + * @return sign count
> + */
> +static inline int
> +mbtowc_count(struct json_path_parser *parser, const char *mb_str, size_t mb_str_len)
> +{
> + char src[mb_str_len+1];

8. I really do not like variable-length arrays allocated on the stack.
Please do not do that. It is bad even if you use this function for errors
only. And please try to remove this function altogether. I propose
calculating the symbol count in struct json_path_parser. For example, you
can add a member symbol_count and increase it on each json_path_next call.

9. It looks like you did not add tests for the complex cases we discussed
verbally, where a space format field has a name that looks like a JSON
path. Please add them. And I recommend that you do not hurry.
How quickly the patch gets pushed into master does not depend on how quickly you resend it. And remember that this will be very hot code, so it must be extremely optimized.
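
To illustrate comment 8, here is a minimal sketch of keeping the symbol count inside the parser, so that no separate counting pass and no stack array are needed. It rests on assumptions: the path is UTF-8, and struct path_parser_sketch, utf8_seq_len and parser_sketch_next_symbol are hypothetical names made up for this example, not the real struct json_path_parser or any existing function.

```c
#include <stddef.h>

/* Hypothetical, simplified parser state (not the real json_path_parser). */
struct path_parser_sketch {
	const char *src;     /* path string, may lack a terminating NUL */
	size_t src_len;      /* length of src in bytes */
	size_t offset;       /* byte offset of the next unread symbol */
	size_t symbol_count; /* symbols consumed so far, for error positions */
};

/*
 * Byte length of a UTF-8 sequence judged by its lead byte.
 * Assumption: input is UTF-8; an unexpected byte counts as one symbol.
 */
static inline size_t
utf8_seq_len(unsigned char lead)
{
	if (lead < 0x80)
		return 1;
	if ((lead & 0xE0) == 0xC0)
		return 2;
	if ((lead & 0xF0) == 0xE0)
		return 3;
	if ((lead & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/*
 * Consume one symbol and bump the counter.
 * Returns 0 on success, -1 when the input is exhausted.
 */
static inline int
parser_sketch_next_symbol(struct path_parser_sketch *p)
{
	if (p->offset >= p->src_len)
		return -1;
	size_t len = utf8_seq_len((unsigned char)p->src[p->offset]);
	/* A truncated trailing sequence is clamped to the remaining bytes. */
	if (len > p->src_len - p->offset)
		len = p->src_len - p->offset;
	p->offset += len;
	p->symbol_count++;
	return 0;
}
```

With this shape, an error position in symbols is simply the current symbol_count at the point of failure, and the only extra cost on the hot path is one increment per symbol.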