From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounce@freelists.org>
Received: from localhost (localhost [127.0.0.1])
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 09B3922E09
	for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 07:14:34 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
	by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id p1MOHxrlO9PC for <tarantool-patches@freelists.org>;
	Thu, 19 Jul 2018 07:14:33 -0400 (EDT)
Received: from mail-lf0-f67.google.com (mail-lf0-f67.google.com [209.85.215.67])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 7D0F021F7B
	for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 07:14:33 -0400 (EDT)
Received: by mail-lf0-f67.google.com with SMTP id a4-v6so5745140lff.5
        for <tarantool-patches@freelists.org>; Thu, 19 Jul 2018 04:14:33 -0700 (PDT)
MIME-Version: 1.0
References: <1530190036-10105-1-git-send-email-hollow653@gmail.com>
 <20180718024314.be245cmsgklxuvnk@tkn_work_nb> <CAEi+_apXUVK8QBrBDm05JYcNxYfB43R0vvsQ_2LHK5j+tv25tg@mail.gmail.com>
 <dcf5206b-892e-2def-a88a-8c8d6507dc57@tarantool.org> <CAEi+_aq3ibC1=3sN4On=P9rHOEB=mZexKEMo5rf1t=3aa6MBTg@mail.gmail.com>
 <20180718171024.ry2cltazdl2hsrpc@tkn_work_nb>
In-Reply-To: <20180718171024.ry2cltazdl2hsrpc@tkn_work_nb>
From: Nikita Tatunov <hollow653@gmail.com>
Date: Thu, 19 Jul 2018 14:14:20 +0300
Message-ID: <CAEi+_aqNw=POs+A6e6jNhxR7Yb7Pgij9gd9df4TyX4iiSXG30A@mail.gmail.com>
Subject: [tarantool-patches] Re: [PATCH] sql: LIKE & GLOB pattern comparison issue
Content-Type: multipart/alternative; boundary="000000000000917c8305715848e2"
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help: <mailto:ecartis@freelists.org?Subject=help>
List-unsubscribe: <tarantool-patches-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches <tarantool-patches.freelists.org>
List-subscribe: <tarantool-patches-request@freelists.org?Subject=subscribe>
List-owner: <mailto:>
List-post: <mailto:tarantool-patches@freelists.org>
List-archive: <http://www.freelists.org/archives/tarantool-patches>
To: alexander.turenko@tarantool.org
Cc: tarantool-patches@freelists.org, avkhatskevich@tarantool.org

--000000000000917c8305715848e2
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

=D1=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 20:10, Alexander T=
urenko <
alexander.turenko@tarantool.org>:

> Two minor comments are below.
>
> WBR, Alexander Turenko.
>
> On Wed, Jul 18, 2018 at 06:57:39PM +0300, Nikita Tatunov wrote:
> >    =D1=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 18:53, Alex=
 Khatskevich
> >    <[1]avkhatskevich@tarantool.org>:
> >
> >    Once you have fixed comments, please apply a new full diff at the en=
d
> >    of email, to
> >    continue review process.
> >    Answer with a full diff to this email please (just paste as a plain
> >    text output of `git diff` command)
> >    On 18.07.2018 18:24, Nikita Tatunov wrote:
> >
> >    -/*
> >    - * For LIKE and GLOB matching on EBCDIC machines, assume that every
> >    - * character is exactly one byte in size.  Also, provde the
> Utf8Read()
> >    - * macro for fast reading of the next character in the common case
> >    where
> >    - * the next character is ASCII.
> >    +/**
> >    + * Providing there are symbols in string s this macro returns
> >    + * UTF-8 code of character and pushes pointer to the next symbol
> >    + * in the string. Otherwise return code is SQL_END_OF_STRING.
>
> Nitpicking: pushes -> promotes?
>
>
Both should be fine, I guess. But ok, changed it.


> >      */
> >    -#define Utf8Read(s, e)    ucnv_getNextUChar(pUtf8conv, &s, e,
> &status)
> >    +#define Utf8Read(s, e) (s < e ?\
> >    + ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0)
> >    +
> >    +#define SQL_END_OF_STRING       0
>
> Always wrap argument usages of a function-like macro within its body
> with parenthesis (consider example I give in the previous email). It is
> usual way to handle the fact that a macros works as text replacement,
> so, say
>
> #define SUB(x, y) x - y
>
> will give the wrong result for the expression SUB(10, 5 - 3), while
>
> #define SUB(x, y) ((x) - (y))
>
> will produce the correct result.
>

Fixed.

Also made few minor codestyle fixes in tests compared to prev. patch.


 diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index c06e3bd..a9784cc 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -617,13 +617,16 @@ struct compareInfo {
  u8 noCase; /* true to ignore case differences */
 };

-/*
- * For LIKE and GLOB matching on EBCDIC machines, assume that every
- * character is exactly one byte in size.  Also, provde the Utf8Read()
- * macro for fast reading of the next character in the common case where
- * the next character is ASCII.
+/**
+ * Providing there are symbols in string s this
+ * macro returns UTF-8 code of character and
+ * promotes pointer to the next symbol in the string.
+ * Otherwise return code is SQL_END_OF_STRING.
  */
-#define Utf8Read(s, e)    ucnv_getNextUChar(pUtf8conv, &s, e, &status)
+#define Utf8Read(s, e) (((s) < (e)) ?\
+ ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0)
+
+#define SQL_END_OF_STRING       0

--000000000000917c8305715848e2
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div><br><div class=3D"gmail_quote"><div dir=3D"ltr">=D1=
=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 20:10, Alexander Ture=
nko &lt;<a href=3D"mailto:alexander.turenko@tarantool.org">alexander.turenk=
o@tarantool.org</a>&gt;:<br></div><blockquote class=3D"gmail_quote" style=
=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding=
-left:1ex">Two minor comments are below.<br>
<br>
WBR, Alexander Turenko.<br>
<br>
On Wed, Jul 18, 2018 at 06:57:39PM +0300, Nikita Tatunov wrote:<br>
&gt;=C2=A0 =C2=A0 =D1=81=D1=80, 18 =D0=B8=D1=8E=D0=BB. 2018 =D0=B3. =D0=B2 =
18:53, Alex Khatskevich<br>
&gt;=C2=A0 =C2=A0 &lt;[1]<a href=3D"mailto:avkhatskevich@tarantool.org" tar=
get=3D"_blank">avkhatskevich@tarantool.org</a>&gt;:<br>
&gt; <br>
&gt;=C2=A0 =C2=A0 Once you have fixed comments, please apply a new full dif=
f at the end<br>
&gt;=C2=A0 =C2=A0 of email, to<br>
&gt;=C2=A0 =C2=A0 continue review process.<br>
&gt;=C2=A0 =C2=A0 Answer with a full diff to this email please (just paste =
as a plain<br>
&gt;=C2=A0 =C2=A0 text output of `git diff` command)<br>
&gt;=C2=A0 =C2=A0 On 18.07.2018 18:24, Nikita Tatunov wrote:<br>
&gt; <br>
&gt;=C2=A0 =C2=A0 -/*<br>
&gt;=C2=A0 =C2=A0 - * For LIKE and GLOB matching on EBCDIC machines, assume=
 that every<br>
&gt;=C2=A0 =C2=A0 - * character is exactly one byte in size.=C2=A0 Also, pr=
ovde the Utf8Read()<br>
&gt;=C2=A0 =C2=A0 - * macro for fast reading of the next character in the c=
ommon case<br>
&gt;=C2=A0 =C2=A0 where<br>
&gt;=C2=A0 =C2=A0 - * the next character is ASCII.<br>
&gt;=C2=A0 =C2=A0 +/**<br>
&gt;=C2=A0 =C2=A0 + * Providing there are symbols in string s this macro re=
turns<br>
&gt;=C2=A0 =C2=A0 + * UTF-8 code of character and pushes pointer to the nex=
t symbol<br>
&gt;=C2=A0 =C2=A0 + * in the string. Otherwise return code is SQL_END_OF_ST=
RING.<br>
<br>
Nitpicking: pushes -&gt; promotes?<br>
<br></blockquote><div><br></div><div>Both should be fine, I guess. But ok, =
changed it.</div><div>=C2=A0</div><blockquote class=3D"gmail_quote" style=
=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding=
-left:1ex">
&gt;=C2=A0 =C2=A0 =C2=A0 */<br>
&gt;=C2=A0 =C2=A0 -#define Utf8Read(s, e)=C2=A0 =C2=A0 ucnv_getNextUChar(pU=
tf8conv, &amp;s, e, &amp;status)<br>
&gt;=C2=A0 =C2=A0 +#define Utf8Read(s, e) (s &lt; e ?\<br>
&gt;=C2=A0 =C2=A0 + ucnv_getNextUChar(pUtf8conv, &amp;s, e, &amp;status) : =
0)<br>
&gt;=C2=A0 =C2=A0 +<br>
&gt;=C2=A0 =C2=A0 +#define SQL_END_OF_STRING=C2=A0 =C2=A0 =C2=A0 =C2=A00<br=
>
<br>
Always wrap argument usages of a function-like macro within its body<br>
with parenthesis (consider example I give in the previous email). It is<br>
usual way to handle the fact that a macros works as text replacement,<br>
so, say<br>
<br>
#define SUB(x, y) x - y<br>
<br>
will give the wrong result for the expression SUB(10, 5 - 3), while<br>
<br>
#define SUB(x, y) ((x) - (y))<br>
<br>
will produce the correct result.<br></blockquote><div><br></div><div>Fixed.=
</div><div><br></div><div><span style=3D"font-size:small;background-color:r=
gb(255,255,255);text-decoration-style:initial;text-decoration-color:initial=
;float:none;display:inline">Also made few minor codestyle fixes in tests co=
mpared to prev. patch.</span><br class=3D"gmail-Apple-interchange-newline">=
<br><br>=C2=A0diff --git a/src/box/sql/func.c b/src/box/sql/func.c</div><di=
v>index c06e3bd..a9784cc 100644</div><div>--- a/src/box/sql/func.c</div><di=
v>+++ b/src/box/sql/func.c</div><div>@@ -617,13 +617,16 @@ struct compareIn=
fo {</div><div>=C2=A0<span style=3D"white-space:pre">	</span>u8 noCase;<spa=
n style=3D"white-space:pre">		</span>/* true to ignore case differences */<=
/div><div>=C2=A0};</div><div>=C2=A0</div><div>-/*</div><div>- * For LIKE an=
d GLOB matching on EBCDIC machines, assume that every</div><div>- * charact=
er is exactly one byte in size.=C2=A0 Also, provde the Utf8Read()</div><div=
>- * macro for fast reading of the next character in the common case where<=
/div><div>- * the next character is ASCII.</div><div>+/**</div><div>+ * Pro=
viding there are symbols in string s this</div><div>+ * macro returns UTF-8=
 code of character and</div><div>+ * promotes pointer to the next symbol in=
 the string.=C2=A0</div><div>+ * Otherwise return code is SQL_END_OF_STRING=
.</div><div>=C2=A0 */</div><div>-#define Utf8Read(s, e)=C2=A0 =C2=A0 ucnv_g=
etNextUChar(pUtf8conv, &amp;s, e, &amp;status)</div><div>+#define Utf8Read(=
s, e) (((s) &lt; (e)) ?\</div><div>+<span style=3D"white-space:pre">	</span=
>ucnv_getNextUChar(pUtf8conv, &amp;s, e, &amp;status) : 0)</div><div>+</div=
><div>+#define SQL_END_OF_STRING=C2=A0 =C2=A0 =C2=A0 =C2=A00</div><div>=C2=
=A0<br></div></div></div></div>

--000000000000917c8305715848e2--