From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mergen Imeev via Tarantool-patches
Reply-To: Mergen Imeev
Date: Mon, 25 Oct 2021 11:40:47 +0300
To: Vladislav Shpilevoy
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v1 14/21] sql: refactor UNICODE() function
Message-ID: <20211025084047.GJ36295@tarantool.org>
In-Reply-To: <77fcefe4-cde9-ed81-4764-c539fe51ea23@tarantool.org>
References: <0c9dfd1b90d72b6756c4a1312ed4c085e9255e47.1633713432.git.imeevma@gmail.com> <77fcefe4-cde9-ed81-4764-c539fe51ea23@tarantool.org>
List-Id: Tarantool development patches

Thank you for
the review! My answer, diff, and new patch are below. Also, I added a
description to the function.

On Fri, Oct 15, 2021 at 12:44:37AM +0200, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> > diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> > index fb7fd772e..5e12ef729 100644
> > --- a/src/box/sql/func.c
> > +++ b/src/box/sql/func.c
> > @@ -1007,6 +1007,19 @@ func_version(struct sql_context *ctx, int argc, struct Mem *argv)
> >  	return mem_set_str0_static(ctx->pOut, (char *)tarantool_version());
> >  }
> >
> > +/** Implementation of the UNICODE() function. */
> > +static void
> > +func_unicode(struct sql_context *ctx, int argc, struct Mem *argv)
> > +{
> > +	assert(argc == 1);
> > +	(void)argc;
> > +	if (mem_is_null(&argv[0]))
> > +		return;
> > +	assert(mem_is_str(&argv[0]));
> > +	const char *str = tt_cstr(argv[0].z, argv[0].n);
> > +	mem_set_uint(ctx->pOut, sqlUtf8Read((const unsigned char **)&str));
>
> You can dodge the copying. See utf8_next() in utf8.c:
>
> 	UChar32 c;
> 	U8_NEXT(str, pos, len, c);

Thanks, fixed.


Diff:

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index ebc38751e..6d80559d5 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1016,17 +1016,28 @@ func_version(struct sql_context *ctx, int argc, struct Mem *argv)
 	return mem_set_str0_static(ctx->pOut, (char *)tarantool_version());
 }

-/** Implementation of the UNICODE() function. */
+/**
+ * Implementation of the UNICODE() function.
+ *
+ * Return the Unicode code point value for the first character of the input
+ * string.
+ */
 static void
 func_unicode(struct sql_context *ctx, int argc, struct Mem *argv)
 {
 	assert(argc == 1);
 	(void)argc;
-	if (mem_is_null(&argv[0]))
+	struct Mem *arg = &argv[0];
+	if (mem_is_null(arg))
 		return;
-	assert(mem_is_str(&argv[0]));
-	const char *str = tt_cstr(argv[0].z, argv[0].n);
-	mem_set_uint(ctx->pOut, sqlUtf8Read((const unsigned char **)&str));
+	assert(mem_is_str(arg));
+	if (arg->n == 0)
+		return mem_set_uint(ctx->pOut, 0);
+	int pos = 0;
+	UChar32 c;
+	U8_NEXT(arg->z, pos, arg->n, c);
+	(void)pos;
+	mem_set_uint(ctx->pOut, (uint64_t)c);
 }

 static const unsigned char *


New patch:

commit 6346c542b8c81814753a1853d7ae347222af0f23
Author: Mergen Imeev
Date:   Thu Oct 7 13:43:38 2021 +0300

    sql: refactor UNICODE() function

    Part of #4145

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 3afc8ec7f..6d80559d5 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1016,6 +1016,30 @@ func_version(struct sql_context *ctx, int argc, struct Mem *argv)
 	return mem_set_str0_static(ctx->pOut, (char *)tarantool_version());
 }

+/**
+ * Implementation of the UNICODE() function.
+ *
+ * Return the Unicode code point value for the first character of the input
+ * string.
+ */
+static void
+func_unicode(struct sql_context *ctx, int argc, struct Mem *argv)
+{
+	assert(argc == 1);
+	(void)argc;
+	struct Mem *arg = &argv[0];
+	if (mem_is_null(arg))
+		return;
+	assert(mem_is_str(arg));
+	if (arg->n == 0)
+		return mem_set_uint(ctx->pOut, 0);
+	int pos = 0;
+	UChar32 c;
+	U8_NEXT(arg->z, pos, arg->n, c);
+	(void)pos;
+	mem_set_uint(ctx->pOut, (uint64_t)c);
+}
+
 static const unsigned char *
 mem_as_ustr(struct Mem *mem)
 {
@@ -1437,19 +1461,6 @@ quoteFunc(struct sql_context *context, int argc, struct Mem *argv)
 	}
 }

-/*
- * The unicode() function. Return the integer unicode code-point value
- * for the first character of the input string.
- */
-static void
-unicodeFunc(struct sql_context *context, int argc, struct Mem *argv)
-{
-	const unsigned char *z = mem_as_ustr(&argv[0]);
-	(void)argc;
-	if (z && z[0])
-		sql_result_uint(context, sqlUtf8Read(&z));
-}
-
 /*
  * The replace() function. Three arguments are all strings: call
  * them A, B, and C. The result is also a string which is derived
@@ -1883,7 +1894,7 @@ static struct sql_func_definition definitions[] = {
 	 FIELD_TYPE_VARBINARY, func_trim_bin, NULL},
 	{"TYPEOF", 1, {FIELD_TYPE_ANY}, FIELD_TYPE_STRING, func_typeof, NULL},
-	{"UNICODE", 1, {FIELD_TYPE_STRING}, FIELD_TYPE_INTEGER, unicodeFunc,
+	{"UNICODE", 1, {FIELD_TYPE_STRING}, FIELD_TYPE_INTEGER, func_unicode,
 	 NULL},
 	{"UNLIKELY", 1, {FIELD_TYPE_ANY}, FIELD_TYPE_BOOLEAN, sql_builtin_stub,
 	 NULL},
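
A note on why the U8_NEXT() suggestion removes the copy: a decoder that takes
an explicit length can read directly from arg->z, so the buffer does not need
to be NUL-terminated and the tt_cstr() copy is unnecessary. The plain-C sketch
below only illustrates that idea. first_code_point() is a hypothetical
stand-in written for this note; it is not ICU's U8_NEXT() and not code from
the patch, and returning -1 for empty or malformed input is an assumption of
the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch, not ICU): decode the first
 * code point of a length-delimited UTF-8 buffer without copying it.
 * Returns the code point, or -1 if the buffer is empty or its first
 * sequence is malformed, truncated, overlong, or a surrogate.
 */
static int32_t
first_code_point(const char *s, size_t len)
{
	assert(s != NULL || len == 0);
	if (len == 0)
		return -1;
	const unsigned char *p = (const unsigned char *)s;
	unsigned char lead = p[0];
	if (lead < 0x80)
		return lead; /* Single-byte (ASCII) sequence. */

	size_t trail;  /* Number of continuation bytes expected. */
	int32_t c;     /* Code point being accumulated. */
	int32_t min;   /* Smallest value allowed for this length. */
	if (lead >= 0xC2 && lead <= 0xDF) {
		trail = 1; c = lead & 0x1F; min = 0x80;
	} else if ((lead & 0xF0) == 0xE0) {
		trail = 2; c = lead & 0x0F; min = 0x800;
	} else if (lead >= 0xF0 && lead <= 0xF4) {
		trail = 3; c = lead & 0x07; min = 0x10000;
	} else {
		return -1; /* Stray continuation byte or invalid lead. */
	}
	if (len < trail + 1)
		return -1; /* Sequence is cut off by the buffer end. */
	for (size_t i = 1; i <= trail; i++) {
		if ((p[i] & 0xC0) != 0x80)
			return -1;
		c = (c << 6) | (p[i] & 0x3F);
	}
	if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		return -1;
	return c;
}
```

ICU documents that U8_NEXT() sets c to a negative value for an illegal
sequence. The thread does not say whether malformed strings can reach
func_unicode(); if they can, the (uint64_t)c cast would turn that negative
value into a very large number, so a check for c < 0 before mem_set_uint()
may be worth considering.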