Date: Sat, 28 Apr 2018 04:10:56 +0300
From: Alexander Turenko
Subject: [tarantool-patches] Re: [PATCH 2/7] lua: implement string.u_count
Message-ID: <20180428011055.h55f36uitb7txrg2@tkn_work_nb>
To: Vladislav Shpilevoy
Cc: kostja@tarantool.org, tarantool-patches@freelists.org

Just one tiny comment below.

WBR, Alexander Turenko.

On Thu, Apr 26, 2018 at 02:29:02AM +0300, Vladislav Shpilevoy wrote:
> Lua can not calculate length of a unicode string correctly. But
> Tarantool has ICU on board - lets use it to calculate length.
>
> u_count has options, that allows to count only symbols of a
> specific class, for example, only capital letters, or digits.
> Options can be combined.
>
> Closes #3081
> ---
>  extra/exports                |  1 +
>  src/CMakeLists.txt           |  1 +
>  src/lua/string.lua           | 52 ++++++++++++++++++++++++++++++++++++++++++++
>  src/util.c                   | 48 +++++++++++++++++++++++++++++++++++++++-
>  test/app-tap/string.test.lua | 22 ++++++++++++++++++-
>  5 files changed, 122 insertions(+), 2 deletions(-)
>
> <...>
> diff --git a/src/util.c b/src/util.c
> index 9458695b9..c117dee05 100644
> --- a/src/util.c
> +++ b/src/util.c
> <...>
> +/**
> + * Get length of a UTF8 string.
> + * @param s UTF8 string.
> + * @param bsize Binary size of @an s.

Is it worth clarifying that it does not include the trailing '\0'?

> + * @param flags Binary OR of u_count_class flags.
> + * @retval >=0 Count of symbols matched one of @a flags.
> + * @retval  <0 Invalid UTF8 on the position -1 * returned value.
> + */
> +int
> +u_count(const char *s, int bsize, uint8_t flags)
> +{