From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <tarantool-patches-bounce@freelists.org> Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 09B3922E09 for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 07:14:34 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id p1MOHxrlO9PC for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 07:14:33 -0400 (EDT) Received: from mail-lf0-f67.google.com (mail-lf0-f67.google.com [209.85.215.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 7D0F021F7B for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 07:14:33 -0400 (EDT) Received: by mail-lf0-f67.google.com with SMTP id a4-v6so5745140lff.5 for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 04:14:33 -0700 (PDT) MIME-Version: 1.0 References: <1530190036-10105-1-git-send-email-hollow653@gmail.com> <20180718024314.be245cmsgklxuvnk@tkn_work_nb> <CAEi+_apXUVK8QBrBDm05JYcNxYfB43R0vvsQ_2LHK5j+tv25tg@mail.gmail.com> <dcf5206b-892e-2def-a88a-8c8d6507dc57@tarantool.org> <CAEi+_aq3ibC1=3sN4On=P9rHOEB=mZexKEMo5rf1t=3aa6MBTg@mail.gmail.com> <20180718171024.ry2cltazdl2hsrpc@tkn_work_nb> In-Reply-To: <20180718171024.ry2cltazdl2hsrpc@tkn_work_nb> From: Nikita Tatunov <hollow653@gmail.com> Date: Thu, 19 Jul 2018 14:14:20 +0300 Message-ID: <CAEi+_aqNw=POs+A6e6jNhxR7Yb7Pgij9gd9df4TyX4iiSXG30A@mail.gmail.com> Subject: [tarantool-patches] Re: [PATCH] sql: LIKE & GLOB pattern comparison issue Content-Type: multipart/alternative; boundary="000000000000917c8305715848e2" Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: <mailto:ecartis@freelists.org?Subject=help> List-unsubscribe: <tarantool-patches-request@freelists.org?Subject=unsubscribe> List-software: Ecartis version 1.0.0 List-Id: tarantool-patches <tarantool-patches.freelists.org> List-subscribe: <tarantool-patches-request@freelists.org?Subject=subscribe> List-owner: <mailto:> List-post: <mailto:tarantool-patches@freelists.org> List-archive: <http://www.freelists.org/archives/tarantool-patches> To: alexander.turenko@tarantool.org Cc: tarantool-patches@freelists.org, avkhatskevich@tarantool.org --000000000000917c8305715848e2 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable =D1=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 20:10, Alexander T= urenko < alexander.turenko@tarantool.org>: > Two minor comments are below. > > WBR, Alexander Turenko. > > On Wed, Jul 18, 2018 at 06:57:39PM +0300, Nikita Tatunov wrote: > > =D1=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 18:53, Alex= Khatskevich > > <[1]avkhatskevich@tarantool.org>: > > > > Once you have fixed comments, please apply a new full diff at the en= d > > of email, to > > continue review process. > > Answer with a full diff to this email please (just paste as a plain > > text output of `git diff` command) > > On 18.07.2018 18:24, Nikita Tatunov wrote: > > > > -/* > > - * For LIKE and GLOB matching on EBCDIC machines, assume that every > > - * character is exactly one byte in size. Also, provde the > Utf8Read() > > - * macro for fast reading of the next character in the common case > > where > > - * the next character is ASCII. > > +/** > > + * Providing there are symbols in string s this macro returns > > + * UTF-8 code of character and pushes pointer to the next symbol > > + * in the string. Otherwise return code is SQL_END_OF_STRING. > > Nitpicking: pushes -> promotes? > > Both should be fine, I guess. But ok, changed it. > > */ > > -#define Utf8Read(s, e) ucnv_getNextUChar(pUtf8conv, &s, e, > &status) > > +#define Utf8Read(s, e) (s < e ?\ > > + ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0) > > + > > +#define SQL_END_OF_STRING 0 > > Always wrap argument usages of a function-like macro within its body > with parenthesis (consider example I give in the previous email). It is > usual way to handle the fact that a macros works as text replacement, > so, say > > #define SUB(x, y) x - y > > will give the wrong result for the expression SUB(10, 5 - 3), while > > #define SUB(x, y) ((x) - (y)) > > will produce the correct result. > Fixed. Also made few minor codestyle fixes in tests compared to prev. patch. diff --git a/src/box/sql/func.c b/src/box/sql/func.c index c06e3bd..a9784cc 100644 --- a/src/box/sql/func.c +++ b/src/box/sql/func.c @@ -617,13 +617,16 @@ struct compareInfo { u8 noCase; /* true to ignore case differences */ }; -/* - * For LIKE and GLOB matching on EBCDIC machines, assume that every - * character is exactly one byte in size. Also, provde the Utf8Read() - * macro for fast reading of the next character in the common case where - * the next character is ASCII. +/** + * Providing there are symbols in string s this + * macro returns UTF-8 code of character and + * promotes pointer to the next symbol in the string. + * Otherwise return code is SQL_END_OF_STRING. */ -#define Utf8Read(s, e) ucnv_getNextUChar(pUtf8conv, &s, e, &status) +#define Utf8Read(s, e) (((s) < (e)) ?\ + ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0) + +#define SQL_END_OF_STRING 0 --000000000000917c8305715848e2 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"ltr"><div><br><div class=3D"gmail_quote"><div dir=3D"ltr">=D1= =81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 20:10, Alexander Ture= nko <<a href=3D"mailto:alexander.turenko@tarantool.org">alexander.turenk= o@tarantool.org</a>>:<br></div><blockquote class=3D"gmail_quote" style= =3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding= -left:1ex">Two minor comments are below.<br> <br> WBR, Alexander Turenko.<br> <br> On Wed, Jul 18, 2018 at 06:57:39PM +0300, Nikita Tatunov wrote:<br> >=C2=A0 =C2=A0 =D1=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 = 18:53, Alex Khatskevich<br> >=C2=A0 =C2=A0 <[1]<a href=3D"mailto:avkhatskevich@tarantool.org" tar= get=3D"_blank">avkhatskevich@tarantool.org</a>>:<br> > <br> >=C2=A0 =C2=A0 Once you have fixed comments, please apply a new full dif= f at the end<br> >=C2=A0 =C2=A0 of email, to<br> >=C2=A0 =C2=A0 continue review process.<br> >=C2=A0 =C2=A0 Answer with a full diff to this email please (just paste = as a plain<br> >=C2=A0 =C2=A0 text output of `git diff` command)<br> >=C2=A0 =C2=A0 On 18.07.2018 18:24, Nikita Tatunov wrote:<br> > <br> >=C2=A0 =C2=A0 -/*<br> >=C2=A0 =C2=A0 - * For LIKE and GLOB matching on EBCDIC machines, assume= that every<br> >=C2=A0 =C2=A0 - * character is exactly one byte in size.=C2=A0 Also, pr= ovde the Utf8Read()<br> >=C2=A0 =C2=A0 - * macro for fast reading of the next character in the c= ommon case<br> >=C2=A0 =C2=A0 where<br> >=C2=A0 =C2=A0 - * the next character is ASCII.<br> >=C2=A0 =C2=A0 +/**<br> >=C2=A0 =C2=A0 + * Providing there are symbols in string s this macro re= turns<br> >=C2=A0 =C2=A0 + * UTF-8 code of character and pushes pointer to the nex= t symbol<br> >=C2=A0 =C2=A0 + * in the string. Otherwise return code is SQL_END_OF_ST= RING.<br> <br> Nitpicking: pushes -> promotes?<br> <br></blockquote><div><br></div><div>Both should be fine, I guess. But ok, = changed it.</div><div>=C2=A0</div><blockquote class=3D"gmail_quote" style= =3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding= -left:1ex"> >=C2=A0 =C2=A0 =C2=A0 */<br> >=C2=A0 =C2=A0 -#define Utf8Read(s, e)=C2=A0 =C2=A0 ucnv_getNextUChar(pU= tf8conv, &s, e, &status)<br> >=C2=A0 =C2=A0 +#define Utf8Read(s, e) (s < e ?\<br> >=C2=A0 =C2=A0 + ucnv_getNextUChar(pUtf8conv, &s, e, &status) : = 0)<br> >=C2=A0 =C2=A0 +<br> >=C2=A0 =C2=A0 +#define SQL_END_OF_STRING=C2=A0 =C2=A0 =C2=A0 =C2=A00<br= > <br> Always wrap argument usages of a function-like macro within its body<br> with parenthesis (consider example I give in the previous email). It is<br> usual way to handle the fact that a macros works as text replacement,<br> so, say<br> <br> #define SUB(x, y) x - y<br> <br> will give the wrong result for the expression SUB(10, 5 - 3), while<br> <br> #define SUB(x, y) ((x) - (y))<br> <br> will produce the correct result.<br></blockquote><div><br></div><div>Fixed.= </div><div><br></div><div><span style=3D"font-size:small;background-color:r= gb(255,255,255);text-decoration-style:initial;text-decoration-color:initial= ;float:none;display:inline">Also made few minor codestyle fixes in tests co= mpared to prev. patch.</span><br class=3D"gmail-Apple-interchange-newline">= <br><br>=C2=A0diff --git a/src/box/sql/func.c b/src/box/sql/func.c</div><di= v>index c06e3bd..a9784cc 100644</div><div>--- a/src/box/sql/func.c</div><di= v>+++ b/src/box/sql/func.c</div><div>@@ -617,13 +617,16 @@ struct compareIn= fo {</div><div>=C2=A0<span style=3D"white-space:pre"> </span>u8 noCase;<spa= n style=3D"white-space:pre"> </span>/* true to ignore case differences */<= /div><div>=C2=A0};</div><div>=C2=A0</div><div>-/*</div><div>- * For LIKE an= d GLOB matching on EBCDIC machines, assume that every</div><div>- * charact= er is exactly one byte in size.=C2=A0 Also, provde the Utf8Read()</div><div= >- * macro for fast reading of the next character in the common case where<= /div><div>- * the next character is ASCII.</div><div>+/**</div><div>+ * Pro= viding there are symbols in string s this</div><div>+ * macro returns UTF-8= code of character and</div><div>+ * promotes pointer to the next symbol in= the string.=C2=A0</div><div>+ * Otherwise return code is SQL_END_OF_STRING= .</div><div>=C2=A0 */</div><div>-#define Utf8Read(s, e)=C2=A0 =C2=A0 ucnv_g= etNextUChar(pUtf8conv, &s, e, &status)</div><div>+#define Utf8Read(= s, e) (((s) < (e)) ?\</div><div>+<span style=3D"white-space:pre"> </span= >ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0)</div><div>+</div= ><div>+#define SQL_END_OF_STRING=C2=A0 =C2=A0 =C2=A0 =C2=A00</div><div>=C2= =A0<br></div></div></div></div> --000000000000917c8305715848e2--