From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 8 Oct 2021 23:56:59 +0200
To: imeevma@tarantool.org
Cc: tarantool-patches@dev.tarantool.org
References: <9cc35ba4625d4e3017725c35fbc4a7ed90341917.1633105483.git.imeevma@gmail.com>
In-Reply-To: <9cc35ba4625d4e3017725c35fbc4a7ed90341917.1633105483.git.imeevma@gmail.com>
Subject: Re: [Tarantool-patches] [PATCH v1 2/8] sql: refactor CHAR_LENGTH() function
List-Id: Tarantool development patches
From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy

Thanks for the patch!

On 01.10.2021 18:29, imeevma@tarantool.org wrote:
> Part of #4145
> ---
>  src/box/sql/func.c | 38 +++++++++++++++++++++++++++++++++++---
>  1 file changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 54b03f359..2e53b32d8 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -263,6 +263,38 @@ func_abs_double(struct sql_context *ctx, int argc, struct Mem *argv)
>  	mem_set_double(ctx->pOut, arg->u.r < 0 ? -arg->u.r : arg->u.r);
>  }
>
> +/** Implementation of the CHAR_LENGTH() function. */
> +static inline uint8_t
> +utf8_len_char(char c)
> +{
> +	uint8_t u = (uint8_t)c;
> +	return 1 + (u >= 0xc2) + (u >= 0xe0) + (u >= 0xf0);

It is not really that simple. Consider either using the old
lengthFunc() and the other sqlite utf8 helpers, or an approach similar
to utf8_len() in utf8.c. utf8_len() uses the ICU macro U8_NEXT() and
handles special values such as U_SENTINEL. Otherwise you are adding
what is already a third set of functions for working with utf8.

I would even prefer to refactor lengthFunc() so that it stops using the
sqlite legacy code, and to drop the sqlite utf8 helpers entirely, but I
suspect that is not trivial and should be done later.
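
To make the "not that simple" point concrete, here is a minimal,
self-contained sketch of what can go wrong with a lead-byte-only
formula. It is only an illustration under stated assumptions, not code
from the patch series or from utf8.c: utf8_len_char() is copied from
the quoted hunk, while count_by_lead_byte() and
check_lead_byte_pitfalls() are hypothetical helpers invented for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lead-byte length formula, copied from the quoted hunk. */
static inline uint8_t
utf8_len_char(char c)
{
	uint8_t u = (uint8_t)c;
	return 1 + (u >= 0xc2) + (u >= 0xe0) + (u >= 0xf0);
}

/*
 * Hypothetical helper (an assumption, not part of the patch): count
 * characters by hopping from one lead byte to the next. *overrun is
 * set to the number of bytes by which the last hop lands past the end
 * of the buffer. No byte outside the buffer is ever read.
 */
static size_t
count_by_lead_byte(const char *str, size_t size, size_t *overrun)
{
	size_t pos = 0;
	size_t count = 0;
	while (pos < size) {
		pos += utf8_len_char(str[pos]);
		count++;
	}
	*overrun = pos - size;
	return count;
}

/* Hypothetical self-check listing inputs the formula cannot judge. */
static void
check_lead_byte_pitfalls(void)
{
	/* A stray continuation byte looks like a complete 1-byte char. */
	assert(utf8_len_char((char)0x80) == 1);
	/* 0xff never occurs in valid UTF-8, yet looks like a 4-byte lead. */
	assert(utf8_len_char((char)0xff) == 4);
	/* A 3-byte sequence cut after 2 bytes: the hop overshoots by 1. */
	const char truncated[] = {(char)0xe2, (char)0x82};
	size_t overrun = 0;
	size_t count = count_by_lead_byte(truncated, sizeof(truncated),
					  &overrun);
	assert(count == 1);
	assert(overrun == 1);
}
```

Calling check_lead_byte_pitfalls() from a main() passes all the
asserts, which shows that the formula alone cannot tell valid input
from malformed or truncated input. A decoder that inspects the trail
bytes, such as the U8_NEXT()-based approach mentioned above, is one way
to cover those cases.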