From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounce@freelists.org>
Received: from localhost (localhost [127.0.0.1])
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 507E6287EF
	for <tarantool-patches@freelists.org>; Wed,  1 Aug 2018 09:56:44 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
	by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id 66sSuaWTYWEx for <tarantool-patches@freelists.org>;
	Wed,  1 Aug 2018 09:56:44 -0400 (EDT)
Received: from smtp44.i.mail.ru (smtp44.i.mail.ru [94.100.177.104])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 2B9ED28396
	for <tarantool-patches@freelists.org>; Wed,  1 Aug 2018 09:56:43 -0400 (EDT)
Subject: [tarantool-patches] Re: [PATCH] sql: LIKE & GLOB pattern
 comparison issue
References: <20180718024314.be245cmsgklxuvnk@tkn_work_nb>
 <CAEi+_apXUVK8QBrBDm05JYcNxYfB43R0vvsQ_2LHK5j+tv25tg@mail.gmail.com>
 <dcf5206b-892e-2def-a88a-8c8d6507dc57@tarantool.org>
 <CAEi+_aq3ibC1=3sN4On=P9rHOEB=mZexKEMo5rf1t=3aa6MBTg@mail.gmail.com>
 <fe7d0ecd-55e4-40df-554a-a885e234271d@tarantool.org>
 <CAEi+_aoW02p5fUH=HZEPoSa2ebMa9meT=Z6NB73kXCLmzqQ7xw@mail.gmail.com>
 <20180727130601.b2oby7dleapd5upg@tkn_work_nb>
 <CAEi+_arMh=rLm7jNAX-ERHjf6GQ5D7RqQK=+wSXCwJjZouoyeA@mail.gmail.com>
 <20180727202219.ikwbax7tysfnmgr4@tkn_work_nb>
 <CAEi+_aqvD4fQxvGNLf1J_TvRmDL-JwQ=e0LJJb=xLCoQLCUu3w@mail.gmail.com>
 <20180731134705.3pij4hwyyirhiwr7@tkn_work_nb>
 <CAEi+_aoLMp0nEYfxNGnf4LZdRCDfpgE0PfFByKxM=wOR374__g@mail.gmail.com>
 <CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com>
From: Alex Khatskevich <avkhatskevich@tarantool.org>
Message-ID: <f3f65411-157c-fcc2-2e77-4e9764291de5@tarantool.org>
Date: Wed, 1 Aug 2018 16:56:34 +0300
MIME-Version: 1.0
In-Reply-To: <CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com>
Content-Type: multipart/alternative;
 boundary="------------B332719D4B32C324E9E80FCF"
Content-Language: en-US
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help: <mailto:ecartis@freelists.org?Subject=help>
List-unsubscribe: <tarantool-patches-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches <tarantool-patches.freelists.org>
List-subscribe: <tarantool-patches-request@freelists.org?Subject=subscribe>
List-owner: <mailto:>
List-post: <mailto:tarantool-patches@freelists.org>
List-archive: <http://www.freelists.org/archives/tarantool-patches>
To: Nikita Tatunov <hollow653@gmail.com>, Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@freelists.org

This is a multi-part message in MIME format.
--------------B332719D4B32C324E9E80FCF
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit


On 01.08.2018 13:51, Nikita Tatunov wrote:
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index c06e3bd..7f93ef6 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -617,13 +617,17 @@ struct compareInfo {
> u8 noCase;/* true to ignore case differences */
>  };
> -/*
> - * For LIKE and GLOB matching on EBCDIC machines, assume that every
> - * character is exactly one byte in size.  Also, provde the Utf8Read()
> - * macro for fast reading of the next character in the common case where
> - * the next character is ASCII.
> +/**
> + * Providing there are symbols in string s this
> + * macro returns UTF-8 code of character and
> + * promotes pointer to the next symbol in the string.
> + * Otherwise return code is SQL_END_OF_STRING.
>   */
> -#define Utf8Read(s, e)    ucnv_getNextUChar(pUtf8conv, &s, e, &status)
> +#define Utf8Read(s, e) (((s) < (e)) ? \
> +ucnv_getNextUChar(pUtf8conv, &(s), (e), &(status)) : 0)
[Later I will ask you to restore the old macro, so you may skip this.]
As I understand it, you return `0` from Utf8Read at the end of the
string.
Let's return `SQL_END_OF_STRING` instead of a bare `0`.
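
A minimal sketch of what I mean, not a drop-in replacement. `next_byte`,
`READ_CHAR` and `count_chars` are hypothetical names, and `next_byte` is
only a one-byte stand-in for ucnv_getNextUChar() so that the example
compiles without ICU:
```c
#include <assert.h>
#include <stddef.h>

/* The sentinel the macro returns when the string is exhausted. */
#define SQL_END_OF_STRING 0

/*
 * Hypothetical stand-in for ucnv_getNextUChar(): returns one byte
 * and advances the pointer. The real code would decode a whole
 * UTF-8 sequence through ICU.
 */
static int
next_byte(const char **s, const char *e)
{
	assert(*s < e);
	return (unsigned char)*(*s)++;
}

/* Same shape as the patched Utf8Read, with the sentinel named. */
#define READ_CHAR(s, e) \
	(((s) < (e)) ? next_byte(&(s), (e)) : SQL_END_OF_STRING)

/* Count characters using the sentinel-terminated loop style. */
static int
count_chars(const char *s, size_t len)
{
	const char *end = s + len;
	int n = 0;
	while (READ_CHAR(s, end) != SQL_END_OF_STRING)
		n++;
	return n;
}
```
With the named sentinel the loop condition and the macro cannot drift
apart if the value ever changes.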
> +
> +#define SQL_END_OF_STRING        0
> +#define SQL_INVALID_UTF8_SYMBOL  0xfffd
>  static const struct compareInfo globInfo = { '*', '?', '[', 0 };
> @@ -638,19 +642,16 @@ static const struct compareInfo likeInfoNorm = { 
> '%', '_', 0, 1 };
>  static const struct compareInfo likeInfoAlt = { '%', '_', 0, 0 };
>  /*
> - * Possible error returns from patternMatch()
> + * Possible error returns from sql_utf8_pattern_compare()
>   */
>  #define SQLITE_MATCH             0
>  #define SQLITE_NOMATCH           1
>  #define SQLITE_NOWILDCARDMATCH   2
> +#define SQL_PROHIBITED_PATTERN   3
I am not sure that a pattern containing invalid characters can be
called `prohibited`.
Could you rename it? My proposal: SQL_INVALID_PATTERN.
Moreover, you have given this definition the `SQL` prefix, which is
good, but similar definitions are still prefixed with `SQLITE`. I would
like you to rename those in this commit (preferred) or in a separate
one, for consistency.
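
Roughly what I have in mind, as a sketch only. All names below
(`sql_pattern_result`, `SQL_MATCH`, `SQL_NOMATCH`, `SQL_NOWILDCARDMATCH`,
`sql_pattern_result_is_error`) are my suggestions and do not exist in
the tree; the numeric values are the ones from your patch:
```c
#include <assert.h>
#include <stdbool.h>

/* Suggested names: one prefix for every comparison result. */
enum sql_pattern_result {
	SQL_MATCH = 0,
	SQL_NOMATCH = 1,
	SQL_NOWILDCARDMATCH = 2,
	SQL_INVALID_PATTERN = 3,
};

/*
 * Only an invalid pattern is reported to the user as an error,
 * every other code is turned into a true/false result.
 */
static bool
sql_pattern_result_is_error(enum sql_pattern_result res)
{
	assert(res <= SQL_INVALID_PATTERN);
	return res == SQL_INVALID_PATTERN;
}
```
Plain #define constants would work just as well. The point is only the
common prefix.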
> -/*
> - * Compare two UTF-8 strings for equality where the first string is
> - * a GLOB or LIKE expression.  Return values:
> - *
> - *    SQLITE_MATCH:            Match
> - *    SQLITE_NOMATCH:          No match
> - *    SQLITE_NOWILDCARDMATCH:  No match in spite of having * or % 
> wildcards.
> +/**
> + * Compare two UTF-8 strings for equality where the first string
> + * is a GLOB or LIKE expression.
>   *
>   * Globbing rules:
>   *
> @@ -663,92 +664,136 @@ static const struct compareInfo likeInfoAlt = { 
> '%', '_', 0, 0 };
>   *
>   *     [^...]     Matches one character not in the enclosed list.
>   *
> - * With the [...] and [^...] matching, a ']' character can be included
> - * in the list by making it the first character after '[' or '^'.  A
> - * range of characters can be specified using '-'. Example:
> - * "[a-z]" matches any single lower-case letter.  To match a '-', make
> - * it the last character in the list.
> + * With the [...] and [^...] matching, a ']' character can be
> + * included in the list by making it the first character after
> + * '[' or '^'. A range of characters can be specified using '-'.
> + * Example: "[a-z]" matches any single lower-case letter.
> + * To match a '-', make it the last character in the list.
Does it work for non-ASCII (multi-byte UTF-8) characters? I suppose not.
Let's document that here and file an issue to make it
work with such characters.
>   *
>   * Like matching rules:
>   *
> - *      '%'       Matches any sequence of zero or more characters
> + *      '%'       Matches any sequence of zero or more characters.
>   *
> - **     '_'       Matches any one character
> + **     '_'       Matches any one character.
>   *
>   *      Ec        Where E is the "esc" character and c is any other
> - *                character, including '%', '_', and esc, match 
> exactly c.
> + *                character, including '%', '_', and esc, match
> + *                exactly c.
>   *
>   * The comments within this routine usually assume glob matching.
>   *
> - * This routine is usually quick, but can be N**2 in the worst case.
> + * This routine is usually quick, but can be N**2 in the worst
> + * case.
> + *
> + * @param pattern String containing comparison pattern.
> + * @param string String being compared.
> + * @param compareInfo Information about how to compare.
> + * @param matchOther The escape char (LIKE) or '[' (GLOB).
> + *
> + * @retval SQLITE_MATCH:            Match.
> + * SQLITE_NOMATCH:          No match.
> + * SQLITE_NOWILDCARDMATCH:  No match in spite of having *
> + *    or % wildcards.
> + * SQL_PROHIBITED_PATTERN:  Pattern contains invalid
> + *    symbol.
Minor: you use `symbol` and `character` interchangeably, which is
confusing.
I suppose that `character` should be used everywhere.
>   */
>  static int
> -patternCompare(const char * pattern,/* The glob pattern */
> -       const char * string,/* The string to compare against the glob */
> -       const struct compareInfo *pInfo,/* Information about how to do 
> the compare */
> -       UChar32 matchOther/* The escape char (LIKE) or '[' (GLOB) */
> -    )
> +sql_utf8_pattern_compare(const char * pattern,
> +const char * string,
> +const struct compareInfo *pInfo,
> +UChar32 matchOther)
The "star" sign should stick to the parameter name (`const char *pattern`).
https://tarantool.io/en/doc/1.9/dev_guide/c_style_guide/#chapter-3-1-spaces

To prevent such typos in the future, you can use the Perl script
that checks coding style in Linux kernel patches (scripts/checkpatch.pl).
>  {
> -UChar32 c, c2;/* Next pattern and input string chars */
> -UChar32 matchOne = pInfo->matchOne;/* "?" or "_" */
> -UChar32 matchAll = pInfo->matchAll;/* "*" or "%" */
> -UChar32 noCase = pInfo->noCase;/* True if uppercase==lowercase */
> -const char *zEscaped = 0;/* One past the last escaped input char */
> +/* Next pattern and input string chars */
> +UChar32 c, c2;
> +/* "?" or "_" */
> +UChar32 matchOne = pInfo->matchOne;
> +/* "*" or "%" */
> +UChar32 matchAll = pInfo->matchAll;
> +/* True if uppercase==lowercase */
> +UChar32 noCase = pInfo->noCase;
> +/* One past the last escaped input char */
> +const char *zEscaped = 0;
> const char * pattern_end = pattern + strlen(pattern);
> const char * string_end = string + strlen(string);
> UErrorCode status = U_ZERO_ERROR;
> -while (pattern < pattern_end){
> -c = Utf8Read(pattern, pattern_end);
> +while ((c = Utf8Read(pattern, pattern_end)) != SQL_END_OF_STRING) {
REMEMBER THIS POINT #1
> +if (c == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> if (c == matchAll) {/* Match "*" */
> -/* Skip over multiple "*" characters in the pattern.  If there
> -* are also "?" characters, skip those as well, but consume a
> -* single character of the input string for each "?" skipped
> +/* Skip over multiple "*" characters in
> +* the pattern. If there are also "?"
> +* characters, skip those as well, but
> +* consume a single character of the
> +* input string for each "?" skipped.
> */
> -while (pattern < pattern_end){
> -c = Utf8Read(pattern, pattern_end);
> +while ((c = Utf8Read(pattern, pattern_end)) !=
> + SQL_END_OF_STRING) {
> +if (c == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> if (c != matchAll && c != matchOne)
> break;
> -if (c == matchOne
> -    && Utf8Read(string, string_end) == 0) {
> +if (c == matchOne &&
> +    (c2 = Utf8Read(string, string_end)) ==
> +SQL_END_OF_STRING)
> return SQLITE_NOWILDCARDMATCH;
> -}
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQLITE_NOMATCH;
> }
> -/* "*" at the end of the pattern matches */
> -if (pattern == pattern_end)
> +/*
> +* "*" at the end of the pattern matches.
> +*/
> +if (c == SQL_END_OF_STRING) {
> +while ((c2 = Utf8Read(string, string_end)) !=
> + SQL_END_OF_STRING)
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQLITE_NOMATCH;
> return SQLITE_MATCH;
> +}
> if (c == matchOther) {
> if (pInfo->matchSet == 0) {
> c = Utf8Read(pattern, pattern_end);
> -if (c == 0)
> +if (c == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> +if (c == SQL_END_OF_STRING)
> return SQLITE_NOWILDCARDMATCH;
> } else {
> -/* "[...]" immediately follows the "*".  We have to do a slow
> -* recursive search in this case, but it is an unusual case.
> +/* "[...]" immediately
> +* follows the "*". We
> +* have to do a slow
> +* recursive search in
> +* this case, but it is
> +* an unusual case.
> */
> -assert(matchOther < 0x80);/* '[' is a single-byte character */
> +assert(matchOther < 0x80);
> while (string < string_end) {
REMEMBER THIS POINT #2

> int bMatch =
> -patternCompare(&pattern[-1],
> -   string,
> -   pInfo,
> - matchOther);
> +sql_utf8_pattern_compare(
> +&pattern[-1],
> +string,
> +pInfo,
> +matchOther);
> if (bMatch != SQLITE_NOMATCH)
> return bMatch;
> -Utf8Read(string, string_end);
> +c = Utf8Read(string, string_end);
> +if (c == SQL_INVALID_UTF8_SYMBOL)
> +return SQLITE_NOMATCH;
Look at <REMEMBER THIS POINT #1,2> and the other `Utf8Read` usages.
You have introduced SQL_END_OF_STRING and changed the `Utf8Read` loop
pattern to use it, but only in about half of the cases.

Moreover, at this place `string < string_end` is checked implicitly
inside `Utf8Read`, but the result of that check is never used.

I suppose you should restore the old iteration style and the old
`Utf8Read` macro:
```
while (string < string_end) {
	c = Utf8Read(string, string_end);
	if (c == SQL_INVALID_UTF8_SYMBOL)
		return SQLITE_NOMATCH;
	/* ... the rest of the loop body stays as it is ... */
}
```
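
For reference, the same iteration style as a self-contained sketch.
`read_char` and `scan_string` are hypothetical names; `read_char` is a
stand-in for ucnv_getNextUChar() that simply treats every byte >= 0x80
as invalid, which is an assumption made to keep the example free of ICU:
```c
#include <assert.h>
#include <stddef.h>

#define SQL_INVALID_UTF8_SYMBOL 0xfffd
#define SQLITE_MATCH 0
#define SQLITE_NOMATCH 1

/*
 * Hypothetical stand-in for ucnv_getNextUChar(): ASCII bytes are
 * returned as is, any other byte is reported as invalid.
 */
static int
read_char(const char **s, const char *e)
{
	assert(*s < e);
	unsigned char b = (unsigned char)*(*s)++;
	return b < 0x80 ? b : SQL_INVALID_UTF8_SYMBOL;
}

/*
 * Walk the string with the bounds check spelled out in the loop
 * condition, so the reader never has to return an end marker.
 */
static int
scan_string(const char *string, size_t len)
{
	const char *string_end = string + len;
	while (string < string_end) {
		int c = read_char(&string, string_end);
		if (c == SQL_INVALID_UTF8_SYMBOL)
			return SQLITE_NOMATCH;
	}
	return SQLITE_MATCH;
}
```
Here the end of the string is detected in exactly one place, the loop
condition, and the reader only has to report invalid input.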
> }
> return SQLITE_NOWILDCARDMATCH;
> }
> }
> -/* At this point variable c contains the first character of the
> -* pattern string past the "*".  Search in the input string for the
> -* first matching character and recursively continue the match from
> -* that point.
> +/* At this point variable c contains the
> +* first character of the pattern string
> +* past the "*". Search in the input
> +* string for the first matching
> +* character and recursively continue the
> +* match from that point.
> *
> -* For a case-insensitive search, set variable cx to be the same as
> -* c but in the other case and search the input string for either
> -* c or cx.
> +* For a case-insensitive search, set
> +* variable cx to be the same as c but in
> +* the other case and search the input
> +* string for either c or cx.
> */
> int bMatch;
> @@ -756,14 +801,18 @@ patternCompare(const char * pattern,/* The glob 
> pattern */
> c = u_tolower(c);
> while (string < string_end){
> /**
> -* This loop could have been implemented
> -* without if converting c2 to lower case
> -* (by holding c_upper and c_lower), however
> -* it is implemented this way because lower
> -* works better with German and Turkish
> -* languages.
> +* This loop could have been
> +* implemented without if
> +* converting c2 to lower case
> +* by holding c_upper and
> +* c_lower,however it is
> +* implemented this way because
> +* lower works better with German
> +* and Turkish languages.
> */
> c2 = Utf8Read(string, string_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQLITE_NOMATCH;
> if (!noCase) {
> if (c2 != c)
> continue;
> @@ -771,9 +820,10 @@ patternCompare(const char * pattern,/* The glob 
> pattern */
> if (c2 != c && u_tolower(c2) != c)
> continue;
> }
> -bMatch =
> -patternCompare(pattern, string,
> -   pInfo, matchOther);
> +bMatch = sql_utf8_pattern_compare(pattern,
> +  string,
> +  pInfo,
> +matchOther);
> if (bMatch != SQLITE_NOMATCH)
> return bMatch;
> }
> @@ -782,7 +832,9 @@ patternCompare(const char * pattern,/* The glob 
> pattern */
> if (c == matchOther) {
> if (pInfo->matchSet == 0) {
> c = Utf8Read(pattern, pattern_end);
> -if (c == 0)
> +if (c == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> +if (c == SQL_END_OF_STRING)
> return SQLITE_NOMATCH;
> zEscaped = pattern;
> } else {
> @@ -790,23 +842,33 @@ patternCompare(const char * pattern,/* The glob 
> pattern */
> int seen = 0;
> int invert = 0;
> c = Utf8Read(string, string_end);
> +if (c == SQL_INVALID_UTF8_SYMBOL)
> +return SQLITE_NOMATCH;
> if (string == string_end)
> return SQLITE_NOMATCH;
> c2 = Utf8Read(pattern, pattern_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> if (c2 == '^') {
> invert = 1;
> c2 = Utf8Read(pattern, pattern_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> }
> if (c2 == ']') {
> if (c == ']')
> seen = 1;
> c2 = Utf8Read(pattern, pattern_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> }
> -while (c2 && c2 != ']') {
> +while (c2 != SQL_END_OF_STRING && c2 != ']') {
> if (c2 == '-' && pattern[0] != ']'
>     && pattern < pattern_end
>     && prior_c > 0) {
> c2 = Utf8Read(pattern, pattern_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> if (c >= prior_c && c <= c2)
> seen = 1;
> prior_c = 0;
> @@ -817,29 +879,36 @@ patternCompare(const char * pattern,/* The glob 
> pattern */
> prior_c = c2;
> }
> c2 = Utf8Read(pattern, pattern_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQL_PROHIBITED_PATTERN;
> }
> -if (pattern == pattern_end || (seen ^ invert) == 0) {
> +if (pattern == pattern_end ||
> +    (seen ^ invert) == 0) {
> return SQLITE_NOMATCH;
> }
> continue;
> }
> }
> c2 = Utf8Read(string, string_end);
> +if (c2 == SQL_INVALID_UTF8_SYMBOL)
> +return SQLITE_NOMATCH;
> if (c == c2)
> continue;
> if (noCase){
> /**
> -* Small optimisation. Reduce number of calls
> -* to u_tolower function.
> -* SQL standards suggest use to_upper for symbol
> -* normalisation. However, using to_lower allows to
> -* respect Turkish 'İ' in default locale.
> +* Small optimisation. Reduce number of
> +* calls to u_tolower function. SQL
> +* standards suggest use to_upper for
> +* symbol normalisation. However, using
> +* to_lower allows to respect Turkish 'İ'
> +* in default locale.
> */
> if (u_tolower(c) == c2 ||
>     c == u_tolower(c2))
> continue;
> }
> -if (c == matchOne && pattern != zEscaped && c2 != 0)
> +if (c == matchOne && pattern != zEscaped &&
> +    c2 != SQL_END_OF_STRING)
> continue;
> return SQLITE_NOMATCH;
> }
> @@ -853,8 +922,7 @@ patternCompare(const char * pattern,/* The glob 
> pattern */
>  int
>  sqlite3_strglob(const char *zGlobPattern, const char *zString)
>  {
> -return patternCompare(zGlobPattern, zString, &globInfo,
> -      '[');
> +return sql_utf8_pattern_compare(zGlobPattern, zString, &globInfo, '[');
>  }
>  /*
> @@ -864,7 +932,7 @@ sqlite3_strglob(const char *zGlobPattern, const 
> char *zString)
>  int
>  sqlite3_strlike(const char *zPattern, const char *zStr, unsigned int esc)
>  {
> -return patternCompare(zPattern, zStr, &likeInfoNorm, esc);
> +return sql_utf8_pattern_compare(zPattern, zStr, &likeInfoNorm, esc);
>  }
>  /*
> @@ -910,8 +978,9 @@ likeFunc(sqlite3_context * context, int argc, 
> sqlite3_value ** argv)
> zB = (const char *) sqlite3_value_text(argv[0]);
> zA = (const char *) sqlite3_value_text(argv[1]);
> -/* Limit the length of the LIKE or GLOB pattern to avoid problems
> -* of deep recursion and N*N behavior in patternCompare().
> +/* Limit the length of the LIKE or GLOB pattern to avoid
> +* problems of deep recursion and N*N behavior in
> +* sql_utf8_pattern_compare().
> */
> nPat = sqlite3_value_bytes(argv[0]);
> testcase(nPat == db->aLimit[SQLITE_LIMIT_LIKE_PATTERN_LENGTH]);
> @@ -947,7 +1016,12 @@ likeFunc(sqlite3_context * context, int argc, 
> sqlite3_value ** argv)
> sqlite3_like_count++;
>  #endif
> int res;
> -res = patternCompare(zB, zA, pInfo, escape);
> +res = sql_utf8_pattern_compare(zB, zA, pInfo, escape);
> +if (res == SQL_PROHIBITED_PATTERN) {
> +sqlite3_result_error(context, "LIKE or GLOB pattern can only"
> +     " contain UTF-8 characters", -1);
> +return;
> +}
> sqlite3_result_int(context, res == SQLITE_MATCH);
>  }
> diff --git a/test-run b/test-run
> index 77e9327..95562e9 160000
> --- a/test-run
> +++ b/test-run
> @@ -1 +1 @@
> -Subproject commit 77e93279210f8c5c1fd0ed03416fa19a184f0b6d
> +Subproject commit 95562e95401fef4e0b755ab0bb430974b5d1a29a
> diff --git a/test/sql-tap/e_expr.test.lua b/test/sql-tap/e_expr.test.lua
> index 13d3a96..9780d2c 100755
> --- a/test/sql-tap/e_expr.test.lua
> +++ b/test/sql-tap/e_expr.test.lua
> @@ -1,6 +1,6 @@
>  #!/usr/bin/env tarantool
>  test = require("sqltester")
> -test:plan(12431)
> +test:plan(10665)
>  --!./tcltestrunner.lua
>  -- 2010 July 16
> @@ -77,8 +77,10 @@ local operations = {
>      {"<>", "ne1"},
>      {"!=", "ne2"},
>      {"IS", "is"},
> -    {"LIKE", "like"},
> -    {"GLOB", "glob"},
> +-- NOTE: This test needs refactoring after deletion of GLOB &
> +--type restrictions for LIKE. (See #3572)
> +--    {"LIKE", "like"},
> +--    {"GLOB", "glob"},
Yes, this behavior is not valid anymore.
To make sure that LIKE and GLOB are still tested in the future, please
delete these commented-out lines and add your own simple test that
calls `like` and `glob` with inappropriate types.
It is important to have functional tests for every possible behavior.
>      {"AND", "and"},
>      {"OR", "or"},
>      {"MATCH", "match"},
> @@ -96,7 +98,12 @@ operations = {
>      {"+", "-"},
>      {"<<", ">>", "&", "|"},
>      {"<", "<=", ">", ">="},
> -    {"=", "==", "!=", "<>", "LIKE", "GLOB"}, --"MATCH", "REGEXP"},
> +-- NOTE: This test needs refactoring after deletion of GLOB &
> +--type restrictions for LIKE. (See #3572)
> +-- Another NOTE: MATCH & REGEXP aren't supported in Tarantool &
> +-- are waiting for their hour, don't confuse them
> +--being commented with ticket above.
> +    {"=", "==", "!=", "<>"}, --"LIKE", "GLOB"}, --"MATCH", "REGEXP"},
>      {"AND"},
>      {"OR"},
>  }
> @@ -475,6 +482,7 @@ for _, op in ipairs(oplist) do
>          end
>      end
>  end
> +
>  ---------------------------------------------------------------------------
>  -- Test the IS and IS NOT operators.
>  --
> diff --git a/test/sql-tap/gh-3251-string-pattern-comparison.test.lua 
> b/test/sql-tap/gh-3251-string-pattern-comparison.test.lua
> new file mode 100755
> index 0000000..2a787f2
> --- /dev/null
> +++ b/test/sql-tap/gh-3251-string-pattern-comparison.test.lua
> @@ -0,0 +1,213 @@
> +#!/usr/bin/env tarantool
> +test = require("sqltester")
> +test:plan(128)
> +
> +local prefix = "like-test-"
> +
> +-- Unicode byte sequences.
> +local valid_testcases = {
> +    '\x01',
> +    '\x09',
> +    '\x1F',
> +    '\x7F',
> +    '\xC2\x80',
> +    '\xC2\x90',
> +    '\xC2\x9F',
> +    '\xE2\x80\xA8',
> +    '\x20\x0B',
> +    '\xE2\x80\xA9',
> +}
Optional: add a comment describing each of these byte sequences (what it is).
> +
> +-- Non-Unicode byte sequences.
> +local invalid_testcases = {
> +    '\xE2\x80',
> +    '\xFE\xFF',
> +    '\xC2',
> +    '\xED\xB0\x80',
> +    '\xD0',
> +}
Move this table below like_test_cases, right before it is used.
> +
> +local like_test_cases =
> +{
> +    {"1.1",
> +        "SELECT 'AB' LIKE '_B';",
> +        {0, {1}} },
> +    {"1.2",
> +        "SELECT 'CD' LIKE '_B';",
> +        {0, {0}} },
> +    {"1.3",
> +        "SELECT '' LIKE '_B';",
> +        {0, {0}} },
> +    {"1.4",
> +        "SELECT 'AB' LIKE '%B';",
> +        {0, {1}} },
> +    {"1.5",
> +        "SELECT 'CD' LIKE '%B';",
> +        {0, {0}} },
> +    {"1.6",
> +        "SELECT '' LIKE '%B';",
> +        {0, {0}} },
> +    {"1.7",
> +        "SELECT 'AB' LIKE 'A__';",
> +        {0, {0}} },
> +    {"1.8",
> +        "SELECT 'CD' LIKE 'A__';",
> +        {0, {0}} },
> +    {"1.9",
> +        "SELECT '' LIKE 'A__';",
> +        {0, {0}} },
> +    {"1.10",
> +        "SELECT 'AB' LIKE 'A_';",
> +        {0, {1}} },
> +    {"1.11",
> +        "SELECT 'CD' LIKE 'A_';",
> +        {0, {0}} },
> +    {"1.12",
> +        "SELECT '' LIKE 'A_';",
> +        {0, {0}} },
> +    {"1.13",
> +        "SELECT 'AB' LIKE 'A';",
> +        {0, {0}} },
> +    {"1.14",
> +        "SELECT 'CD' LIKE 'A';",
> +        {0, {0}} },
> +    {"1.15",
> +        "SELECT '' LIKE 'A';",
> +        {0, {0}} },
> +    {"1.16",
> +        "SELECT 'AB' LIKE '_';",
> +        {0, {0}} },
> +    {"1.17",
> +        "SELECT 'CD' LIKE '_';",
> +        {0, {0}} },
> +    {"1.18",
> +        "SELECT '' LIKE '_';",
> +        {0, {0}} },
> +    {"1.19",
> +        "SELECT 'AB' LIKE '__';",
> +        {0, {1}} },
> +    {"1.20",
> +        "SELECT 'CD' LIKE '__';",
> +        {0, {1}} },
> +    {"1.21",
> +        "SELECT '' LIKE '__';",
> +        {0, {0}} },
> +    {"1.22",
> +        "SELECT 'AB' LIKE '%A';",
> +        {0, {0}} },
> +    {"1.23",
> +        "SELECT 'AB' LIKE '%C';",
> +        {0, {0}} },
> +    {"1.24",
> +        "SELECT 'ab' LIKE '%df';",
> +        {0, {0}} },
> +    {"1.25",
> +        "SELECT 'abCDF' LIKE '%df';",
> +        {0, {1}} },
> +    {"1.26",
> +        "SELECT 'CDF' LIKE '%df';",
> +        {0, {1}} },
> +    {"1.27",
> +        "SELECT 'ab' LIKE 'a_';",
> +        {0, {1}} },
> +    {"1.28",
> +        "SELECT 'abCDF' LIKE 'a_';",
> +        {0, {0}} },
> +    {"1.29",
> +        "SELECT 'CDF' LIKE 'a_';",
> +        {0, {0}} },
> +    {"1.30",
> +        "SELECT 'ab' LIKE 'ab%';",
> +        {0, {1}} },
> +    {"1.31",
> +        "SELECT 'abCDF' LIKE 'ab%';",
> +        {0, {1}} },
> +    {"1.32",
> +        "SELECT 'CDF' LIKE 'ab%';",
> +        {0, {0}} },
> +    {"1.33",
> +        "SELECT 'ab' LIKE 'abC%';",
> +        {0, {0}} },
> +    {"1.34",
> +        "SELECT 'abCDF' LIKE 'abC%';",
> +        {0, {1}} },
> +    {"1.35",
> +        "SELECT 'CDF' LIKE 'abC%';",
> +        {0, {0}} },
> +    {"1.36",
> +        "SELECT 'ab' LIKE 'a_%';",
> +        {0, {1}} },
> +    {"1.37",
> +        "SELECT 'abCDF' LIKE 'a_%';",
> +        {0, {1}} },
> +    {"1.38",
> +        "SELECT 'CDF' LIKE 'a_%';",
> +        {0, {0}} },
> +}
Please add some tests for Unicode strings (or replace the letters in
these tests with non-ASCII letters).
> +
> +test:do_catchsql_set_test(like_test_cases, prefix)
> +
> +-- Invalid testcases.
> +for i, tested_string in ipairs(invalid_testcases) do
> +
> +    -- We should raise an error in case
> +    -- pattern contains invalid characters.
> +
> +    local test_name = prefix .. "2." .. tostring(i)
> +    local test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. "';"
> +    test:do_catchsql_test(test_name, test_itself,
> +                          {1, "LIKE or GLOB pattern can only contain 
> UTF-8 characters"})
> +
> +    test_name = prefix .. "3." .. tostring(i)
> +    test_itself = "SELECT 'abc' LIKE 'abc" .. tested_string .. "';"
> +    test:do_catchsql_test(test_name, test_itself,
> +                          {1, "LIKE or GLOB pattern can only contain 
> UTF-8 characters"})
> +
> +    test_name = prefix .. "4." .. tostring(i)
> +    test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. "c';"
> +    test:do_catchsql_test(test_name, test_itself,
> +                          {1, "LIKE or GLOB pattern can only contain 
> UTF-8 characters"})
> +
> +    -- Just skipping if row value predicand contains invalid character.
What is a predicand? Is it a typo?
> +
> +    test_name = prefix .. "5." .. tostring(i)
> +    test_itself = "SELECT 'ab" .. tested_string .. "' LIKE 'abc';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +
> +    test_name = prefix .. "6." .. tostring(i)
> +    test_itself = "SELECT 'abc" .. tested_string .. "' LIKE 'abc';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +
> +    test_name = prefix .. "7." .. tostring(i)
> +    test_itself = "SELECT 'ab" .. tested_string .. "c' LIKE 'abc';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +end
> +
> +-- Valid testcases.
> +for i, tested_string in ipairs(valid_testcases) do
> +    test_name = prefix .. "8." .. tostring(i)
> +    local test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. "';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +
> +    test_name = prefix .. "9." .. tostring(i)
> +    test_itself = "SELECT 'abc' LIKE 'abc" .. tested_string .. "';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +
> +    test_name = prefix .. "10." .. tostring(i)
> +    test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. "c';"
> +    test:do_execsql_test(test_name,test_itself, {0})
> +
> +    test_name = prefix .. "11." .. tostring(i)
> +    test_itself = "SELECT 'ab" .. tested_string .. "' LIKE 'abc';"
> +    test:do_execsql_test(test_name,test_itself, {0})
> +
> +    test_name = prefix .. "12." .. tostring(i)
> +    test_itself = "SELECT 'abc" .. tested_string .. "' LIKE 'abc';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +
> +    test_name = prefix .. "13." .. tostring(i)
> +    test_itself = "SELECT 'ab" .. tested_string .. "c' LIKE 'abc';"
> +    test:do_execsql_test(test_name, test_itself, {0})
> +end
> +
> +test:finish_test()
Why can't I find a test for `GLOB`? Even if we delete it in the future,
it should be tested. You can write far fewer tests for GLOB.
E.g. this
```
select '1' glob '[0-4]';
```
for some reason returns 0.
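
A possible explanation, based only on the hunk quoted above and not on
a debugger session: in the `[...]` branch the check
`string == string_end` runs after `Utf8Read` has already advanced
`string`, so a set that has to match the last character of the input is
rejected. The sketch below is an ASCII-only model of that branch;
`first_char_in_range` and its `check_before_read` flag are hypothetical
and exist only to compare the two orders of the check:
```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * ASCII-only model of matching the first input character against
 * a "[lo-hi]" set. Not the real sql_utf8_pattern_compare().
 */
static bool
first_char_in_range(const char *string, char lo, char hi,
		    bool check_before_read)
{
	const char *string_end = string + strlen(string);
	assert(lo <= hi);
	/* Proposed order: make sure there is a character to read. */
	if (check_before_read && string == string_end)
		return false;
	char c = (string < string_end) ? *string++ : 0;
	/*
	 * Order in the quoted hunk: the pointer has already moved,
	 * so a one-character input looks like an empty one.
	 */
	if (!check_before_read && string == string_end)
		return false;
	return c >= lo && c <= hi;
}
```
With `check_before_read` set to false the model rejects "1" against
`[0-4]`, which matches what I see. Please check whether the real code
behaves the same way.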

Sorry, some of the tests I am asking you to write are a little out of
the scope of this ticket, and they should have been written already.
But I suppose most of the ambiguity should be clarified now. This ticket
has raised important questions related to those functions.

--------------B332719D4B32C324E9E80FCF
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: 8bit

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 01.08.2018 13:51, Nikita Tatunov
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=utf-8">
      <div dir="ltr">
        <div>diff --git a/src/box/sql/func.c b/src/box/sql/func.c</div>
        <div>index c06e3bd..7f93ef6 100644</div>
        <div>--- a/src/box/sql/func.c</div>
        <div>+++ b/src/box/sql/func.c</div>
        <div>@@ -617,13 +617,17 @@ struct compareInfo {</div>
        <div> <span style="white-space:pre">	</span>u8 noCase;<span style="white-space:pre">		</span>/*
          true to ignore case differences */</div>
        <div> };</div>
        <div> </div>
        <div>-/*</div>
        <div>- * For LIKE and GLOB matching on EBCDIC machines, assume
          that every</div>
        <div>- * character is exactly one byte in size.  Also, provde
          the Utf8Read()</div>
        <div>- * macro for fast reading of the next character in the
          common case where</div>
        <div>- * the next character is ASCII.</div>
        <div>+/**</div>
        <div>+ * Providing there are symbols in string s this</div>
        <div>+ * macro returns UTF-8 code of character and</div>
        <div>+ * promotes pointer to the next symbol in the string.</div>
        <div>+ * Otherwise return code is SQL_END_OF_STRING.</div>
        <div>  */</div>
        <div>-#define Utf8Read(s, e)    ucnv_getNextUChar(pUtf8conv,
          &amp;s, e, &amp;status)</div>
        <div>+#define Utf8Read(s, e) (((s) &lt; (e)) ? \</div>
        <div>+<span style="white-space:pre">	</span>ucnv_getNextUChar(pUtf8conv,
          &amp;(s), (e), &amp;(status)) : 0)</div>
      </div>
    </blockquote>
    [Later I will ask you to restore the old macro, so you may skip
    this.]<br>
    As I understand it, you return `0` from Utf8Read at the end of the
    string.<br>
    Let's return `SQL_END_OF_STRING` instead of a bare `0`.<br>
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>+</div>
        <div>+#define SQL_END_OF_STRING        0</div>
        <div>+#define SQL_INVALID_UTF8_SYMBOL  0xfffd</div>
        <div> </div>
        <div> static const struct compareInfo globInfo = { '*', '?',
          '[', 0 };</div>
        <div> </div>
        <div>@@ -638,19 +642,16 @@ static const struct compareInfo
          likeInfoNorm = { '%', '_', 0, 1 };</div>
        <div> static const struct compareInfo likeInfoAlt = { '%', '_',
          0, 0 };</div>
        <div> </div>
        <div> /*</div>
        <div>- * Possible error returns from patternMatch()</div>
        <div>+ * Possible error returns from sql_utf8_pattern_compare()</div>
        <div>  */</div>
        <div> #define SQLITE_MATCH             0</div>
        <div> #define SQLITE_NOMATCH           1</div>
        <div> #define SQLITE_NOWILDCARDMATCH   2</div>
        <div>+#define SQL_PROHIBITED_PATTERN   3</div>
      </div>
    </blockquote>
    I am not sure that a pattern containing invalid characters can be
    called `prohibited`.<br>
    Could you rename it? My proposal: SQL_INVALID_PATTERN.<br>
    Moreover, you have given this definition the `SQL` prefix,
    which is good, but<br>
    similar definitions are still prefixed with `SQLITE`. I would like
    you to rename those in<br>
    this commit (preferred) or in a separate one, for consistency.
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div> </div>
        <div>-/*</div>
        <div>- * Compare two UTF-8 strings for equality where the first
          string is</div>
        <div>- * a GLOB or LIKE expression.  Return values:</div>
        <div>- *</div>
        <div>- *    SQLITE_MATCH:            Match</div>
        <div>- *    SQLITE_NOMATCH:          No match</div>
        <div>- *    SQLITE_NOWILDCARDMATCH:  No match in spite of having
          * or % wildcards.</div>
        <div>+/**</div>
        <div>+ * Compare two UTF-8 strings for equality where the first
          string</div>
        <div>+ * is a GLOB or LIKE expression.</div>
        <div>  *</div>
        <div>  * Globbing rules:</div>
        <div>  *</div>
        <div>@@ -663,92 +664,136 @@ static const struct compareInfo
          likeInfoAlt = { '%', '_', 0, 0 };</div>
        <div>  *</div>
        <div>  *     [^...]     Matches one character not in the
          enclosed list.</div>
        <div>  *</div>
        <div>- * With the [...] and [^...] matching, a ']' character can
          be included</div>
        <div>- * in the list by making it the first character after '['
          or '^'.  A</div>
        <div>- * range of characters can be specified using '-'. 
          Example:</div>
        <div>- * "[a-z]" matches any single lower-case letter.  To match
          a '-', make</div>
        <div>- * it the last character in the list.</div>
        <div>+ * With the [...] and [^...] matching, a ']' character can
          be</div>
        <div>+ * included in the list by making it the first character
          after</div>
        <div>+ * '[' or '^'. A range of characters can be specified
          using '-'.</div>
        <div>+ * Example: "[a-z]" matches any single lower-case letter.</div>
        <div>+ * To match a '-', make it the last character in the list.</div>
      </div>
    </blockquote>
    Does this work for non-ASCII UTF-8 characters? I suppose not.<br>
    Let's document that here and file an issue to make it<br>
    work with such characters.
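    <br>
    To make the question concrete, here is a standalone sketch of what
    matching a non-ASCII range such as "[а-я]" requires: the range
    bounds and the input character have to be compared as decoded code
    points, not as bytes. `decode2` and `in_range` are hypothetical
    helpers (`decode2` handles only two-byte UTF-8 sequences); this is
    not the code from the patch:<br>
<pre>
```c
#include <stdint.h>

/*
 * Hypothetical helper: decode one two-byte UTF-8 sequence
 * (U+0080..U+07FF). No validation, illustration only.
 */
static uint32_t
decode2(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	return ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
}

/* Return 1 if code point c lies in the inclusive range [lo, hi]. */
static int
in_range(uint32_t c, uint32_t lo, uint32_t hi)
{
	return c >= lo && c <= hi;
}
```
</pre>
    With these helpers, `б` (bytes D0 B1, U+0431) falls into the range
    from `а` (U+0430) to `я` (U+044F), while an ASCII letter does not.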
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>  *</div>
        <div>  * Like matching rules:</div>
        <div>  *</div>
        <div>- *      '%'       Matches any sequence of zero or more
          characters</div>
        <div>+ *      '%'       Matches any sequence of zero or more
          characters.</div>
        <div>  *</div>
        <div>- **     '_'       Matches any one character</div>
        <div>+ **     '_'       Matches any one character.</div>
        <div>  *</div>
        <div>  *      Ec        Where E is the "esc" character and c is
          any other</div>
        <div>- *                character, including '%', '_', and esc,
          match exactly c.</div>
        <div>+ *                character, including '%', '_', and esc,
          match</div>
        <div>+ *                exactly c.</div>
        <div>  *</div>
        <div>  * The comments within this routine usually assume glob
          matching.</div>
        <div>  *</div>
        <div>- * This routine is usually quick, but can be N**2 in the
          worst case.</div>
        <div>+ * This routine is usually quick, but can be N**2 in the
          worst</div>
        <div>+ * case.</div>
        <div>+ *</div>
        <div>+ * @param pattern String containing comparison pattern.</div>
        <div>+ * @param string String being compared.</div>
        <div>+ * @param compareInfo Information about how to compare.</div>
        <div>+ * @param matchOther The escape char (LIKE) or '[' (GLOB).</div>
        <div>+ *</div>
        <div>+ * @retval SQLITE_MATCH:            Match.</div>
        <div>+ *<span style="white-space:pre">	</span> 
           SQLITE_NOMATCH:          No match.</div>
        <div>+ *<span style="white-space:pre">	</span> 
           SQLITE_NOWILDCARDMATCH:  No match in spite of having *</div>
        <div>+ *<span style="white-space:pre">				</span>    or %
          wildcards.</div>
        <div>+ *<span style="white-space:pre">	</span> 
           SQL_PROHIBITED_PATTERN:  Pattern contains invalid</div>
        <div>+ *<span style="white-space:pre">				</span>    symbol.</div>
      </div>
    </blockquote>
    Minor: you use `symbol` and `character` interchangeably, which is
    confusing.<br>
    I suggest using `character` everywhere.
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>  */</div>
        <div> static int</div>
        <div>-patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div>-<span style="white-space:pre">	</span>       const char *
          string,<span style="white-space:pre">	</span>/* The string to
          compare against the glob */</div>
        <div>-<span style="white-space:pre">	</span>       const struct
          compareInfo *pInfo,<span style="white-space:pre">	</span>/*
          Information about how to do the compare */</div>
        <div>-<span style="white-space:pre">	</span>       UChar32
          matchOther<span style="white-space:pre">	</span>/* The escape
          char (LIKE) or '[' (GLOB) */</div>
        <div>-    )</div>
        <div>+sql_utf8_pattern_compare(const char * pattern,</div>
        <div>+<span style="white-space:pre">			</span> const char *
          string,</div>
        <div>+<span style="white-space:pre">			</span> const struct
          compareInfo *pInfo,</div>
        <div>+<span style="white-space:pre">			</span> UChar32
          matchOther)</div>
      </div>
    </blockquote>
    The `*` should stick to the parameter name (`const char *pattern`).<br>
<a class="moz-txt-link-freetext" href="https://tarantool.io/en/doc/1.9/dev_guide/c_style_guide/#chapter-3-1-spaces">https://tarantool.io/en/doc/1.9/dev_guide/c_style_guide/#chapter-3-1-spaces</a><br>
    <br>
    To catch such style issues in the future, you can use the Perl<br>
    script that checks coding style in Linux kernel patches (checkpatch.pl).
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div> {</div>
        <div>-<span style="white-space:pre">	</span>UChar32 c, c2;<span style="white-space:pre">		</span>/*
          Next pattern and input string chars */</div>
        <div>-<span style="white-space:pre">	</span>UChar32 matchOne =
          pInfo-&gt;matchOne;<span style="white-space:pre">	</span>/*
          "?" or "_" */</div>
        <div>-<span style="white-space:pre">	</span>UChar32 matchAll =
          pInfo-&gt;matchAll;<span style="white-space:pre">	</span>/*
          "*" or "%" */</div>
        <div>-<span style="white-space:pre">	</span>UChar32 noCase =
          pInfo-&gt;noCase;<span style="white-space:pre">	</span>/* True
          if uppercase==lowercase */</div>
        <div>-<span style="white-space:pre">	</span>const char *zEscaped
          = 0;<span style="white-space:pre">	</span>/* One past the last
          escaped input char */</div>
        <div>+<span style="white-space:pre">	</span>/* Next pattern and
          input string chars */</div>
        <div>+<span style="white-space:pre">	</span>UChar32 c, c2;</div>
        <div>+<span style="white-space:pre">	</span>/* "?" or "_" */</div>
        <div>+<span style="white-space:pre">	</span>UChar32 matchOne =
          pInfo-&gt;matchOne;</div>
        <div>+<span style="white-space:pre">	</span>/* "*" or "%" */</div>
        <div>+<span style="white-space:pre">	</span>UChar32 matchAll =
          pInfo-&gt;matchAll;</div>
        <div>+<span style="white-space:pre">	</span>/* True if
          uppercase==lowercase */</div>
        <div>+<span style="white-space:pre">	</span>UChar32 noCase =
          pInfo-&gt;noCase;</div>
        <div>+<span style="white-space:pre">	</span>/* One past the last
          escaped input char */</div>
        <div>+<span style="white-space:pre">	</span>const char *zEscaped
          = 0;</div>
        <div> <span style="white-space:pre">	</span>const char *
          pattern_end = pattern + strlen(pattern);</div>
        <div> <span style="white-space:pre">	</span>const char *
          string_end = string + strlen(string);</div>
        <div> <span style="white-space:pre">	</span>UErrorCode status =
          U_ZERO_ERROR;</div>
        <div> </div>
        <div>-<span style="white-space:pre">	</span>while (pattern &lt;
          pattern_end){</div>
        <div>-<span style="white-space:pre">		</span>c =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">	</span>while ((c =
          Utf8Read(pattern, pattern_end)) != SQL_END_OF_STRING) {</div>
      </div>
    </blockquote>
    REMEMBER THIS POINT #1<br>
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>+<span style="white-space:pre">		</span>if (c ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">			</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">		</span>if (c == matchAll)
          {<span style="white-space:pre">	</span>/* Match "*" */</div>
        <div>-<span style="white-space:pre">			</span>/* Skip over
          multiple "*" characters in the pattern.  If there</div>
        <div>-<span style="white-space:pre">			</span> * are also "?"
          characters, skip those as well, but consume a</div>
        <div>-<span style="white-space:pre">			</span> * single
          character of the input string for each "?" skipped</div>
        <div>+<span style="white-space:pre">			</span>/* Skip over
          multiple "*" characters in</div>
        <div>+<span style="white-space:pre">			</span> * the pattern. If
          there are also "?"</div>
        <div>+<span style="white-space:pre">			</span> * characters,
          skip those as well, but</div>
        <div>+<span style="white-space:pre">			</span> * consume a
          single character of the</div>
        <div>+<span style="white-space:pre">			</span> * input string
          for each "?" skipped.</div>
        <div> <span style="white-space:pre">			</span> */</div>
        <div>-<span style="white-space:pre">			</span>while (pattern
          &lt; pattern_end){</div>
        <div>-<span style="white-space:pre">				</span>c =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">			</span>while ((c =
          Utf8Read(pattern, pattern_end)) !=</div>
        <div>+<span style="white-space:pre">			</span>     
           SQL_END_OF_STRING) {</div>
        <div>+<span style="white-space:pre">				</span>if (c ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">					</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">				</span>if (c != matchAll
          &amp;&amp; c != matchOne)</div>
        <div> <span style="white-space:pre">					</span>break;</div>
        <div>-<span style="white-space:pre">				</span>if (c == matchOne</div>
        <div>-<span style="white-space:pre">				</span>    &amp;&amp;
          Utf8Read(string, string_end) == 0) {</div>
        <div>+<span style="white-space:pre">				</span>if (c == matchOne
          &amp;&amp;</div>
        <div>+<span style="white-space:pre">				</span>    (c2 =
          Utf8Read(string, string_end)) ==</div>
        <div>+<span style="white-space:pre">				</span>   
          SQL_END_OF_STRING)</div>
        <div> <span style="white-space:pre">					</span>return
          SQLITE_NOWILDCARDMATCH;</div>
        <div>-<span style="white-space:pre">				</span>}</div>
        <div>+<span style="white-space:pre">				</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">					</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">			</span>}</div>
        <div>-<span style="white-space:pre">			</span>/* "*" at the end
          of the pattern matches */</div>
        <div>-<span style="white-space:pre">			</span>if (pattern ==
          pattern_end)</div>
        <div>+<span style="white-space:pre">			</span>/*</div>
        <div>+<span style="white-space:pre">			</span> * "*" at the end
          of the pattern matches.</div>
        <div>+<span style="white-space:pre">			</span> */</div>
        <div>+<span style="white-space:pre">			</span>if (c ==
          SQL_END_OF_STRING) {</div>
        <div>+<span style="white-space:pre">				</span>while ((c2 =
          Utf8Read(string, string_end)) !=</div>
        <div>+<span style="white-space:pre">				</span>     
           SQL_END_OF_STRING)</div>
        <div>+<span style="white-space:pre">					</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">						</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">				</span>return
          SQLITE_MATCH;</div>
        <div>+<span style="white-space:pre">			</span>}</div>
        <div> <span style="white-space:pre">			</span>if (c ==
          matchOther) {</div>
        <div> <span style="white-space:pre">				</span>if
          (pInfo-&gt;matchSet == 0) {</div>
        <div> <span style="white-space:pre">					</span>c =
          Utf8Read(pattern, pattern_end);</div>
        <div>-<span style="white-space:pre">					</span>if (c == 0)</div>
        <div>+<span style="white-space:pre">					</span>if (c ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">						</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div>+<span style="white-space:pre">					</span>if (c ==
          SQL_END_OF_STRING)</div>
        <div> <span style="white-space:pre">						</span>return
          SQLITE_NOWILDCARDMATCH;</div>
        <div> <span style="white-space:pre">				</span>} else {</div>
        <div>-<span style="white-space:pre">					</span>/* "[...]"
          immediately follows the "*".  We have to do a slow</div>
        <div>-<span style="white-space:pre">					</span> * recursive
          search in this case, but it is an unusual case.</div>
        <div>+<span style="white-space:pre">					</span>/* "[...]"
          immediately</div>
        <div>+<span style="white-space:pre">					</span> * follows the
          "*". We</div>
        <div>+<span style="white-space:pre">					</span> * have to do a
          slow</div>
        <div>+<span style="white-space:pre">					</span> * recursive
          search in</div>
        <div>+<span style="white-space:pre">					</span> * this case,
          but it is</div>
        <div>+<span style="white-space:pre">					</span> * an unusual
          case.</div>
        <div> <span style="white-space:pre">					</span> */</div>
        <div>-<span style="white-space:pre">					</span>assert(matchOther
          &lt; 0x80);<span style="white-space:pre">	</span>/* '[' is a
          single-byte character */</div>
        <div>+<span style="white-space:pre">					</span>assert(matchOther
          &lt; 0x80);</div>
        <div> <span style="white-space:pre">					</span>while (string
          &lt; string_end) {</div>
      </div>
    </blockquote>
    REMEMBER THIS POINT #2<br>
    <br>
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div> <span style="white-space:pre">						</span>int bMatch =</div>
        <div>-<span style="white-space:pre">						</span>   
          patternCompare(&amp;pattern[-1],</div>
        <div>-<span style="white-space:pre">								</span>   string,</div>
        <div>-<span style="white-space:pre">								</span>   pInfo,</div>
        <div>-<span style="white-space:pre">								</span> 
           matchOther);</div>
        <div>+<span style="white-space:pre">						</span>   
          sql_utf8_pattern_compare(</div>
        <div>+<span style="white-space:pre">								</span>&amp;pattern[-1],</div>
        <div>+<span style="white-space:pre">								</span>string,</div>
        <div>+<span style="white-space:pre">								</span>pInfo,</div>
        <div>+<span style="white-space:pre">								</span>matchOther);</div>
        <div> <span style="white-space:pre">						</span>if (bMatch !=
          SQLITE_NOMATCH)</div>
        <div> <span style="white-space:pre">							</span>return bMatch;</div>
        <div>-<span style="white-space:pre">						</span>Utf8Read(string,
          string_end);</div>
        <div>+<span style="white-space:pre">						</span>c =
          Utf8Read(string, string_end);</div>
        <div>+<span style="white-space:pre">						</span>if (c ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">							</span>return
          SQLITE_NOMATCH;</div>
      </div>
    </blockquote>
    Look at &lt;REMEMBER THIS POINT #1, #2&gt; and the other `Utf8Read`
    usages.<br>
    You introduced SQL_END_OF_STRING and changed the `Utf8Read` loop
    pattern to use it, but only in<br>
    half of the cases?<br>
    <br>
    Moreover, in the hunk just above, `string &lt; string_end` is
    checked again implicitly inside<br>
    `Utf8Read`, but you never use that result.<br>
    <br>
    I suppose you should restore the old iteration style and the old
    `Utf8Read` macro.<br>
    ```<br>
    while (string &lt; string_end) {<br>
    <span style="white-space:pre">    </span>c = Utf8Read(string, string_end);<br>
    <span style="white-space:pre">    </span>if (c == SQL_INVALID_UTF8_SYMBOL)<br>
    <span style="white-space:pre">        </span>return SQLITE_NOMATCH;<br>
    ```<br>
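    A more complete, self-contained sketch of this iteration style.
    `read_char` is a hypothetical, simplified stand-in for
    `ucnv_getNextUChar` (ASCII bytes pass through, any other byte is
    reported as invalid), and `validate` is an illustrative helper, not
    part of the patch:<br>
<pre>
```c
#include <stdint.h>

typedef int32_t UChar32;	/* ICU declares UChar32 as int32_t. */

#define SQL_INVALID_UTF8_SYMBOL  0xfffd
#define SQLITE_MATCH             0
#define SQLITE_NOMATCH           1

/*
 * Hypothetical stand-in for ucnv_getNextUChar(): ASCII bytes are
 * returned as is, any other byte is reported as invalid.
 */
static UChar32
read_char(const char **s)
{
	unsigned char b = (unsigned char)*(*s)++;
	return b < 0x80 ? (UChar32)b : SQL_INVALID_UTF8_SYMBOL;
}

/* Old-style macro: the caller is responsible for checking s < e. */
#define Utf8Read(s, e) read_char(&(s))

/* Walk the string; the loop condition is the only bounds check. */
static int
validate(const char *string, const char *string_end)
{
	UChar32 c;
	while (string < string_end) {
		c = Utf8Read(string, string_end);
		if (c == SQL_INVALID_UTF8_SYMBOL)
			return SQLITE_NOMATCH;
	}
	return SQLITE_MATCH;
}
```
</pre>
    Here the end of the string is detected in exactly one place, the
    loop condition, so no end-of-string sentinel has to be checked
    after each read.<br>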
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div> <span style="white-space:pre">					</span>}</div>
        <div> <span style="white-space:pre">					</span>return
          SQLITE_NOWILDCARDMATCH;</div>
        <div> <span style="white-space:pre">				</span>}</div>
        <div> <span style="white-space:pre">			</span>}</div>
        <div> </div>
        <div>-<span style="white-space:pre">			</span>/* At this point
          variable c contains the first character of the</div>
        <div>-<span style="white-space:pre">			</span> * pattern string
          past the "*".  Search in the input string for the</div>
        <div>-<span style="white-space:pre">			</span> * first matching
          character and recursively continue the match from</div>
        <div>-<span style="white-space:pre">			</span> * that point.</div>
        <div>+<span style="white-space:pre">			</span>/* At this point
          variable c contains the</div>
        <div>+<span style="white-space:pre">			</span> * first character
          of the pattern string</div>
        <div>+<span style="white-space:pre">			</span> * past the "*".
          Search in the input</div>
        <div>+<span style="white-space:pre">			</span> * string for the
          first matching</div>
        <div>+<span style="white-space:pre">			</span> * character and
          recursively continue the</div>
        <div>+<span style="white-space:pre">			</span> * match from that
          point.</div>
        <div> <span style="white-space:pre">			</span> *</div>
        <div>-<span style="white-space:pre">			</span> * For a
          case-insensitive search, set variable cx to be the same as</div>
        <div>-<span style="white-space:pre">			</span> * c but in the
          other case and search the input string for either</div>
        <div>-<span style="white-space:pre">			</span> * c or cx.</div>
        <div>+<span style="white-space:pre">			</span> * For a
          case-insensitive search, set</div>
        <div>+<span style="white-space:pre">			</span> * variable cx to
          be the same as c but in</div>
        <div>+<span style="white-space:pre">			</span> * the other case
          and search the input</div>
        <div>+<span style="white-space:pre">			</span> * string for
          either c or cx.</div>
        <div> <span style="white-space:pre">			</span> */</div>
        <div> </div>
        <div> <span style="white-space:pre">			</span>int bMatch;</div>
        <div>@@ -756,14 +801,18 @@ patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div> <span style="white-space:pre">				</span>c = u_tolower(c);</div>
        <div> <span style="white-space:pre">			</span>while (string &lt;
          string_end){</div>
        <div> <span style="white-space:pre">				</span>/**</div>
        <div>-<span style="white-space:pre">				</span> * This loop
          could have been implemented</div>
        <div>-<span style="white-space:pre">				</span> * without if
          converting c2 to lower case</div>
        <div>-<span style="white-space:pre">				</span> * (by holding
          c_upper and c_lower), however</div>
        <div>-<span style="white-space:pre">				</span> * it is
          implemented this way because lower</div>
        <div>-<span style="white-space:pre">				</span> * works better
          with German and Turkish</div>
        <div>-<span style="white-space:pre">				</span> * languages.</div>
        <div>+<span style="white-space:pre">				</span> * This loop
          could have been</div>
        <div>+<span style="white-space:pre">				</span> * implemented
          without if</div>
        <div>+<span style="white-space:pre">				</span> * converting c2
          to lower case</div>
        <div>+<span style="white-space:pre">				</span> * by holding
          c_upper and</div>
        <div>+<span style="white-space:pre">				</span> *
          c_lower,however it is</div>
        <div>+<span style="white-space:pre">				</span> * implemented
          this way because</div>
        <div>+<span style="white-space:pre">				</span> * lower works
          better with German</div>
        <div>+<span style="white-space:pre">				</span> * and Turkish
          languages.</div>
        <div> <span style="white-space:pre">				</span> */</div>
        <div> <span style="white-space:pre">				</span>c2 =
          Utf8Read(string, string_end);</div>
        <div>+<span style="white-space:pre">				</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">					</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">				</span>if (!noCase) {</div>
        <div> <span style="white-space:pre">					</span>if (c2 != c)</div>
        <div> <span style="white-space:pre">						</span>continue;</div>
        <div>@@ -771,9 +820,10 @@ patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div> <span style="white-space:pre">					</span>if (c2 != c
          &amp;&amp; u_tolower(c2) != c)</div>
        <div> <span style="white-space:pre">						</span>continue;</div>
        <div> <span style="white-space:pre">				</span>}</div>
        <div>-<span style="white-space:pre">				</span>bMatch =</div>
        <div>-<span style="white-space:pre">				</span>   
          patternCompare(pattern, string,</div>
        <div>-<span style="white-space:pre">						</span>   pInfo,
          matchOther);</div>
        <div>+<span style="white-space:pre">				</span>bMatch =
          sql_utf8_pattern_compare(pattern,</div>
        <div>+<span style="white-space:pre">								</span>  string,</div>
        <div>+<span style="white-space:pre">								</span>  pInfo,</div>
        <div>+<span style="white-space:pre">								</span> 
          matchOther);</div>
        <div> <span style="white-space:pre">				</span>if (bMatch !=
          SQLITE_NOMATCH)</div>
        <div> <span style="white-space:pre">					</span>return bMatch;</div>
        <div> <span style="white-space:pre">			</span>}</div>
        <div>@@ -782,7 +832,9 @@ patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div> <span style="white-space:pre">		</span>if (c ==
          matchOther) {</div>
        <div> <span style="white-space:pre">			</span>if
          (pInfo-&gt;matchSet == 0) {</div>
        <div> <span style="white-space:pre">				</span>c =
          Utf8Read(pattern, pattern_end);</div>
        <div>-<span style="white-space:pre">				</span>if (c == 0)</div>
        <div>+<span style="white-space:pre">				</span>if (c ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">					</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div>+<span style="white-space:pre">				</span>if (c ==
          SQL_END_OF_STRING)</div>
        <div> <span style="white-space:pre">					</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">				</span>zEscaped =
          pattern;</div>
        <div> <span style="white-space:pre">			</span>} else {</div>
        <div>@@ -790,23 +842,33 @@ patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div> <span style="white-space:pre">				</span>int seen = 0;</div>
        <div> <span style="white-space:pre">				</span>int invert = 0;</div>
        <div> <span style="white-space:pre">				</span>c =
          Utf8Read(string, string_end);</div>
        <div>+<span style="white-space:pre">				</span>if (c ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">					</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">				</span>if (string ==
          string_end)</div>
        <div> <span style="white-space:pre">					</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">				</span>c2 =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">				</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">					</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">				</span>if (c2 == '^') {</div>
        <div> <span style="white-space:pre">					</span>invert = 1;</div>
        <div> <span style="white-space:pre">					</span>c2 =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">					</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">						</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">				</span>}</div>
        <div> <span style="white-space:pre">				</span>if (c2 == ']') {</div>
        <div> <span style="white-space:pre">					</span>if (c == ']')</div>
        <div> <span style="white-space:pre">						</span>seen = 1;</div>
        <div> <span style="white-space:pre">					</span>c2 =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">					</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">						</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">				</span>}</div>
        <div>-<span style="white-space:pre">				</span>while (c2
          &amp;&amp; c2 != ']') {</div>
        <div>+<span style="white-space:pre">				</span>while (c2 !=
          SQL_END_OF_STRING &amp;&amp; c2 != ']') {</div>
        <div> <span style="white-space:pre">					</span>if (c2 == '-'
          &amp;&amp; pattern[0] != ']'</div>
        <div> <span style="white-space:pre">					</span>    &amp;&amp;
          pattern &lt; pattern_end</div>
        <div> <span style="white-space:pre">					</span>    &amp;&amp;
          prior_c &gt; 0) {</div>
        <div> <span style="white-space:pre">						</span>c2 =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">						</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">							</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">						</span>if (c &gt;=
          prior_c &amp;&amp; c &lt;= c2)</div>
        <div> <span style="white-space:pre">							</span>seen = 1;</div>
        <div> <span style="white-space:pre">						</span>prior_c = 0;</div>
        <div>@@ -817,29 +879,36 @@ patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div> <span style="white-space:pre">						</span>prior_c = c2;</div>
        <div> <span style="white-space:pre">					</span>}</div>
        <div> <span style="white-space:pre">					</span>c2 =
          Utf8Read(pattern, pattern_end);</div>
        <div>+<span style="white-space:pre">					</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">						</span>return
          SQL_PROHIBITED_PATTERN;</div>
        <div> <span style="white-space:pre">				</span>}</div>
        <div>-<span style="white-space:pre">				</span>if (pattern ==
          pattern_end || (seen ^ invert) == 0) {</div>
        <div>+<span style="white-space:pre">				</span>if (pattern ==
          pattern_end ||</div>
        <div>+<span style="white-space:pre">				</span>    (seen ^
          invert) == 0) {</div>
        <div> <span style="white-space:pre">					</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">				</span>}</div>
        <div> <span style="white-space:pre">				</span>continue;</div>
        <div> <span style="white-space:pre">			</span>}</div>
        <div> <span style="white-space:pre">		</span>}</div>
        <div> <span style="white-space:pre">		</span>c2 =
          Utf8Read(string, string_end);</div>
        <div>+<span style="white-space:pre">		</span>if (c2 ==
          SQL_INVALID_UTF8_SYMBOL)</div>
        <div>+<span style="white-space:pre">			</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">		</span>if (c == c2)</div>
        <div> <span style="white-space:pre">			</span>continue;</div>
        <div> <span style="white-space:pre">		</span>if (noCase){</div>
        <div> <span style="white-space:pre">			</span>/**</div>
        <div>-<span style="white-space:pre">			</span> * Small
          optimisation. Reduce number of calls</div>
        <div>-<span style="white-space:pre">			</span> * to u_tolower
          function.</div>
        <div>-<span style="white-space:pre">			</span> * SQL standards
          suggest use to_upper for symbol</div>
        <div>-<span style="white-space:pre">			</span> * normalisation.
          However, using to_lower allows to</div>
        <div>-<span style="white-space:pre">			</span> * respect Turkish
          'İ' in default locale.</div>
        <div>+<span style="white-space:pre">			</span> * Small
          optimisation. Reduce number of</div>
        <div>+<span style="white-space:pre">			</span> * calls to
          u_tolower function. SQL</div>
        <div>+<span style="white-space:pre">			</span> * standards
          suggest use to_upper for</div>
        <div>+<span style="white-space:pre">			</span> * symbol
          normalisation. However, using</div>
        <div>+<span style="white-space:pre">			</span> * to_lower allows
          to respect Turkish 'İ'</div>
        <div>+<span style="white-space:pre">			</span> * in default
          locale.</div>
        <div> <span style="white-space:pre">			</span> */</div>
        <div> <span style="white-space:pre">			</span>if (u_tolower(c)
          == c2 ||</div>
        <div> <span style="white-space:pre">			</span>    c ==
          u_tolower(c2))</div>
        <div> <span style="white-space:pre">				</span>continue;</div>
        <div> <span style="white-space:pre">		</span>}</div>
        <div>-<span style="white-space:pre">		</span>if (c == matchOne
          &amp;&amp; pattern != zEscaped &amp;&amp; c2 != 0)</div>
        <div>+<span style="white-space:pre">		</span>if (c == matchOne
          &amp;&amp; pattern != zEscaped &amp;&amp;</div>
        <div>+<span style="white-space:pre">		</span>    c2 !=
          SQL_END_OF_STRING)</div>
        <div> <span style="white-space:pre">			</span>continue;</div>
        <div> <span style="white-space:pre">		</span>return
          SQLITE_NOMATCH;</div>
        <div> <span style="white-space:pre">	</span>}</div>
        <div>@@ -853,8 +922,7 @@ patternCompare(const char * pattern,<span style="white-space:pre">	</span>/*
          The glob pattern */</div>
        <div> int</div>
        <div> sqlite3_strglob(const char *zGlobPattern, const char
          *zString)</div>
        <div> {</div>
        <div>-<span style="white-space:pre">	</span>return
          patternCompare(zGlobPattern, zString, &amp;globInfo,</div>
        <div>-<span style="white-space:pre">			</span>      '[');</div>
        <div>+<span style="white-space:pre">	</span>return
          sql_utf8_pattern_compare(zGlobPattern, zString, &amp;globInfo,
          '[');</div>
        <div> }</div>
        <div> </div>
        <div> /*</div>
        <div>@@ -864,7 +932,7 @@ sqlite3_strglob(const char
          *zGlobPattern, const char *zString)</div>
        <div> int</div>
        <div> sqlite3_strlike(const char *zPattern, const char *zStr,
          unsigned int esc)</div>
        <div> {</div>
        <div>-<span style="white-space:pre">	</span>return
          patternCompare(zPattern, zStr, &amp;likeInfoNorm, esc);</div>
        <div>+<span style="white-space:pre">	</span>return
          sql_utf8_pattern_compare(zPattern, zStr, &amp;likeInfoNorm,
          esc);</div>
        <div> }</div>
        <div> </div>
        <div> /*</div>
        <div>@@ -910,8 +978,9 @@ likeFunc(sqlite3_context * context, int
          argc, sqlite3_value ** argv)</div>
        <div> <span style="white-space:pre">	</span>zB = (const char *)
          sqlite3_value_text(argv[0]);</div>
        <div> <span style="white-space:pre">	</span>zA = (const char *)
          sqlite3_value_text(argv[1]);</div>
        <div> </div>
        <div>-<span style="white-space:pre">	</span>/* Limit the length
          of the LIKE or GLOB pattern to avoid problems</div>
        <div>-<span style="white-space:pre">	</span> * of deep recursion
          and N*N behavior in patternCompare().</div>
        <div>+<span style="white-space:pre">	</span>/* Limit the length
          of the LIKE or GLOB pattern to avoid</div>
        <div>+<span style="white-space:pre">	</span> * problems of deep
          recursion and N*N behavior in</div>
        <div>+<span style="white-space:pre">	</span> *
          sql_utf8_pattern_compare().</div>
        <div> <span style="white-space:pre">	</span> */</div>
        <div> <span style="white-space:pre">	</span>nPat =
          sqlite3_value_bytes(argv[0]);</div>
        <div> <span style="white-space:pre">	</span>testcase(nPat ==
          db-&gt;aLimit[SQLITE_LIMIT_LIKE_PATTERN_LENGTH]);</div>
        <div>@@ -947,7 +1016,12 @@ likeFunc(sqlite3_context * context,
          int argc, sqlite3_value ** argv)</div>
        <div> <span style="white-space:pre">	</span>sqlite3_like_count++;</div>
        <div> #endif</div>
        <div> <span style="white-space:pre">	</span>int res;</div>
        <div>-<span style="white-space:pre">	</span>res =
          patternCompare(zB, zA, pInfo, escape);</div>
        <div>+<span style="white-space:pre">	</span>res =
          sql_utf8_pattern_compare(zB, zA, pInfo, escape);</div>
        <div>+<span style="white-space:pre">	</span>if (res ==
          SQL_PROHIBITED_PATTERN) {</div>
        <div>+<span style="white-space:pre">		</span>sqlite3_result_error(context,
          "LIKE or GLOB pattern can only"</div>
        <div>+<span style="white-space:pre">				</span>     " contain
          UTF-8 characters", -1);</div>
        <div>+<span style="white-space:pre">		</span>return;</div>
        <div>+<span style="white-space:pre">	</span>}</div>
        <div> <span style="white-space:pre">	</span>sqlite3_result_int(context,
          res == SQLITE_MATCH);</div>
        <div> }</div>
        <div> </div>
        <div>diff --git a/test-run b/test-run</div>
        <div>index 77e9327..95562e9 160000</div>
        <div>--- a/test-run</div>
        <div>+++ b/test-run</div>
        <div>@@ -1 +1 @@</div>
        <div>-Subproject commit 77e93279210f8c5c1fd0ed03416fa19a184f0b6d</div>
        <div>+Subproject commit 95562e95401fef4e0b755ab0bb430974b5d1a29a</div>
        <div>diff --git a/test/sql-tap/e_expr.test.lua
          b/test/sql-tap/e_expr.test.lua</div>
        <div>index 13d3a96..9780d2c 100755</div>
        <div>--- a/test/sql-tap/e_expr.test.lua</div>
        <div>+++ b/test/sql-tap/e_expr.test.lua</div>
        <div>@@ -1,6 +1,6 @@</div>
        <div> #!/usr/bin/env tarantool</div>
        <div> test = require("sqltester")</div>
        <div>-test:plan(12431)</div>
        <div>+test:plan(10665)</div>
        <div> </div>
        <div> --!./tcltestrunner.lua</div>
        <div> -- 2010 July 16</div>
        <div>@@ -77,8 +77,10 @@ local operations = {</div>
        <div>     {"&lt;&gt;", "ne1"},</div>
        <div>     {"!=", "ne2"},</div>
        <div>     {"IS", "is"},</div>
        <div>-    {"LIKE", "like"},</div>
        <div>-    {"GLOB", "glob"},</div>
        <div>+-- NOTE: This test needs refactoring after deletion of
          GLOB &amp;</div>
        <div>+--<span style="white-space:pre">	</span> type restrictions
          for LIKE. (See #3572)</div>
        <div>+--    {"LIKE", "like"},</div>
        <div>+--    {"GLOB", "glob"},</div>
      </div>
    </blockquote>
    Yes, this behavior is not valid anymore.<br>
    To make sure that `like` and `glob` will be tested in the future,
    please delete these<br>
    commented-out lines and add your own simple test that tries to call
    `like` and `glob`<br>
    with inappropriate types.<br>
    It is important to have functional tests for every possible
    behavior.<br>
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>     {"AND", "and"},</div>
        <div>     {"OR", "or"},</div>
        <div>     {"MATCH", "match"},</div>
        <div>@@ -96,7 +98,12 @@ operations = {</div>
        <div>     {"+", "-"},</div>
        <div>     {"&lt;&lt;", "&gt;&gt;", "&amp;", "|"},</div>
        <div>     {"&lt;", "&lt;=", "&gt;", "&gt;="},</div>
        <div>-    {"=", "==", "!=", "&lt;&gt;", "LIKE", "GLOB"},
          --"MATCH", "REGEXP"},</div>
        <div>+-- NOTE: This test needs refactoring after deletion of
          GLOB &amp;</div>
        <div>+--<span style="white-space:pre">	</span> type restrictions
          for LIKE. (See #3572)</div>
        <div>+-- Another NOTE: MATCH &amp; REGEXP aren't supported in
          Tarantool &amp;</div>
        <div>+-- <span style="white-space:pre">		</span> are waiting
          for their hour, don't confuse them</div>
        <div>+--<span style="white-space:pre">		</span> being commented
          with ticket above.</div>
        <div>+    {"=", "==", "!=", "&lt;&gt;"}, --"LIKE", "GLOB"},
          --"MATCH", "REGEXP"},</div>
        <div>     {"AND"},</div>
        <div>     {"OR"},</div>
        <div> }</div>
        <div>@@ -475,6 +482,7 @@ for _, op in ipairs(oplist) do</div>
        <div>         end</div>
        <div>     end</div>
        <div> end</div>
        <div>+</div>
        <div> ---------------------------------------------------------------------------</div>
        <div> -- Test the IS and IS NOT operators.</div>
        <div> --</div>
        <div>diff --git
          a/test/sql-tap/gh-3251-string-pattern-comparison.test.lua
          b/test/sql-tap/gh-3251-string-pattern-comparison.test.lua</div>
        <div>new file mode 100755</div>
        <div>index 0000000..2a787f2</div>
        <div>--- /dev/null</div>
        <div>+++
          b/test/sql-tap/gh-3251-string-pattern-comparison.test.lua</div>
        <div>@@ -0,0 +1,213 @@</div>
        <div>+#!/usr/bin/env tarantool</div>
        <div>+test = require("sqltester")</div>
        <div>+test:plan(128)</div>
        <div>+</div>
        <div>+local prefix = "like-test-"</div>
        <div>+</div>
        <div>+-- Unicode byte sequences.</div>
        <div>+local valid_testcases = {</div>
        <div>+    '\x01',</div>
        <div>+    '\x09',</div>
        <div>+    '\x1F',</div>
        <div>+    '\x7F',</div>
        <div>+    '\xC2\x80',</div>
        <div>+    '\xC2\x90',</div>
        <div>+    '\xC2\x9F',</div>
        <div>+    '\xE2\x80\xA8',</div>
        <div>+    '\x20\x0B',</div>
        <div>+    '\xE2\x80\xA9',</div>
        <div>+}</div>
      </div>
    </blockquote>
    Optional: add a description to each of these byte sequences (what
    it represents).<br>
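    As an aid for writing those descriptions, here is a minimal sketch
    that decodes the first character of a byte sequence to its code
    point. It is not part of the patch: the helper name is hypothetical,
    and it assumes the input is well-formed UTF-8 (it rejects only bad
    lead bytes and truncation).<br>
    <pre>
```c
/*
 * Hypothetical helper, not part of the patch: return the code point of
 * the first UTF-8 character in bytes[0..avail), or -1 if the lead byte
 * cannot start a character or the sequence is truncated.
 * Assumption: continuation bytes are well-formed (they are not checked).
 */
static long
utf8_first_code_point(const char *bytes, int avail)
{
	const unsigned char *s = (const unsigned char *)bytes;
	int len;
	long cp;

	if (avail == 0)
		return -1;
	if (s[0] >= 0xF8) {
		return -1;		/* never a valid lead byte */
	} else if (s[0] >= 0xF0) {
		len = 4;
		cp = s[0] - 0xF0;	/* payload bits of 11110xxx */
	} else if (s[0] >= 0xE0) {
		len = 3;
		cp = s[0] - 0xE0;	/* payload bits of 1110xxxx */
	} else if (s[0] >= 0xC0) {
		len = 2;
		cp = s[0] - 0xC0;	/* payload bits of 110xxxxx */
	} else if (s[0] >= 0x80) {
		return -1;		/* stray continuation byte */
	} else {
		return s[0];		/* ASCII */
	}
	if (len > avail)
		return -1;		/* truncated sequence */
	/* Each continuation byte 10xxxxxx contributes six payload bits. */
	for (int i = 1; i != len; i++)
		cp = cp * 64 + (s[i] - 0x80);
	return cp;
}
```
</pre>
    With this, '\xE2\x80\xA8' decodes to U+2028 (LINE SEPARATOR),
    '\xE2\x80\xA9' to U+2029 (PARAGRAPH SEPARATOR), and '\xC2\x80',
    '\xC2\x90', '\xC2\x9F' to the C1 control characters U+0080, U+0090
    and U+009F. Note that '\x20\x0B' is two separate characters, U+0020
    (SPACE) followed by U+000B (vertical tab), not one multi-byte
    character.<br>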
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>+</div>
        <div>+-- Non-Unicode byte sequences.</div>
        <div>+local invalid_testcases = {</div>
        <div>+    '\xE2\x80',</div>
        <div>+    '\xFE\xFF',</div>
        <div>+    '\xC2',</div>
        <div>+    '\xED\xB0\x80',</div>
        <div>+    '\xD0',</div>
        <div>+}</div>
      </div>
    </blockquote>
    Place this table after like_test_cases, just before it is used.<br>
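    For reference, a minimal sketch of why each of these sequences is
    rejected, using the second-byte ranges from the Unicode definition of
    well-formed UTF-8. It is not the check the patch performs: the helper
    name is hypothetical and the function is only an illustration.<br>
    <pre>
```c
/*
 * Hypothetical helper, not part of the patch: return the byte length of
 * the well-formed UTF-8 character at bytes[0..avail), or 0 if the bytes
 * there are ill-formed or truncated.
 */
static int
utf8_valid_len(const char *bytes, int avail)
{
	const unsigned char *s = (const unsigned char *)bytes;
	/* Allowed range of the second byte; narrowed for some leads. */
	unsigned char lo = 0x80, hi = 0xBF;
	int len;

	if (avail == 0)
		return 0;
	if (s[0] >= 0xF5)
		return 0;	/* 0xF5..0xFF never occur in UTF-8 */
	else if (s[0] >= 0xF0)
		len = 4;
	else if (s[0] >= 0xE0)
		len = 3;
	else if (s[0] >= 0xC2)
		len = 2;
	else if (s[0] >= 0x80)
		return 0;	/* stray continuation or overlong lead */
	else
		return 1;	/* ASCII */
	if (len > avail)
		return 0;	/* truncated sequence */
	if (s[0] == 0xE0)
		lo = 0xA0;	/* below that is an overlong encoding */
	if (s[0] == 0xED)
		hi = 0x9F;	/* above that is a UTF-16 surrogate */
	if (s[0] == 0xF0)
		lo = 0x90;	/* below that is an overlong encoding */
	if (s[0] == 0xF4)
		hi = 0x8F;	/* above that exceeds U+10FFFF */
	if (lo > s[1] || s[1] > hi)
		return 0;
	/* Remaining bytes must be plain continuation bytes 10xxxxxx. */
	for (int i = 2; i != len; i++) {
		if ((s[i] >> 6) != 2)
			return 0;
	}
	return len;
}
```
</pre>
    Under this check, '\xE2\x80', '\xC2' and '\xD0' are truncated
    sequences, '\xFE\xFF' consists of bytes that never occur in UTF-8,
    and '\xED\xB0\x80' encodes the surrogate U+DC00, so all five
    return 0.<br>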
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>+</div>
        <div>+local like_test_cases =</div>
        <div>+{</div>
        <div>+    {"1.1",</div>
        <div>+        "SELECT 'AB' LIKE '_B';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.2",</div>
        <div>+        "SELECT 'CD' LIKE '_B';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.3",</div>
        <div>+        "SELECT '' LIKE '_B';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.4",</div>
        <div>+        "SELECT 'AB' LIKE '%B';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.5",</div>
        <div>+        "SELECT 'CD' LIKE '%B';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.6",</div>
        <div>+        "SELECT '' LIKE '%B';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.7",</div>
        <div>+        "SELECT 'AB' LIKE 'A__';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.8",</div>
        <div>+        "SELECT 'CD' LIKE 'A__';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.9",</div>
        <div>+        "SELECT '' LIKE 'A__';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.10",</div>
        <div>+        "SELECT 'AB' LIKE 'A_';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.11",</div>
        <div>+        "SELECT 'CD' LIKE 'A_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.12",</div>
        <div>+        "SELECT '' LIKE 'A_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.13",</div>
        <div>+        "SELECT 'AB' LIKE 'A';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.14",</div>
        <div>+        "SELECT 'CD' LIKE 'A';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.15",</div>
        <div>+        "SELECT '' LIKE 'A';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.16",</div>
        <div>+        "SELECT 'AB' LIKE '_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.17",</div>
        <div>+        "SELECT 'CD' LIKE '_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.18",</div>
        <div>+        "SELECT '' LIKE '_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.19",</div>
        <div>+        "SELECT 'AB' LIKE '__';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.20",</div>
        <div>+        "SELECT 'CD' LIKE '__';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.21",</div>
        <div>+        "SELECT '' LIKE '__';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.22",</div>
        <div>+        "SELECT 'AB' LIKE '%A';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.23",</div>
        <div>+        "SELECT 'AB' LIKE '%C';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.24",</div>
        <div>+        "SELECT 'ab' LIKE '%df';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.25",</div>
        <div>+        "SELECT 'abCDF' LIKE '%df';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.26",</div>
        <div>+        "SELECT 'CDF' LIKE '%df';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.27",</div>
        <div>+        "SELECT 'ab' LIKE 'a_';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.28",</div>
        <div>+        "SELECT 'abCDF' LIKE 'a_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.29",</div>
        <div>+        "SELECT 'CDF' LIKE 'a_';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.30",</div>
        <div>+        "SELECT 'ab' LIKE 'ab%';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.31",</div>
        <div>+        "SELECT 'abCDF' LIKE 'ab%';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.32",</div>
        <div>+        "SELECT 'CDF' LIKE 'ab%';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.33",</div>
        <div>+        "SELECT 'ab' LIKE 'abC%';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.34",</div>
        <div>+        "SELECT 'abCDF' LIKE 'abC%';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.35",</div>
        <div>+        "SELECT 'CDF' LIKE 'abC%';",</div>
        <div>+        {0, {0}} },</div>
        <div>+    {"1.36",</div>
        <div>+        "SELECT 'ab' LIKE 'a_%';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.37",</div>
        <div>+        "SELECT 'abCDF' LIKE 'a_%';",</div>
        <div>+        {0, {1}} },</div>
        <div>+    {"1.38",</div>
        <div>+        "SELECT 'CDF' LIKE 'a_%';",</div>
        <div>+        {0, {0}} },</div>
        <div>+}</div>
      </div>
    </blockquote>
    Please add some tests for Unicode strings (or replace the letters in
    these tests with non-ASCII Unicode letters).<br>
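    The reason such tests matter is that `_` has to consume one character,
    not one byte. A minimal sketch of a character-aware matcher follows.
    It is not the patch's implementation: the function names are
    hypothetical, the comparison is case-sensitive (unlike LIKE), ESCAPE
    is not supported, and both arguments are assumed to be valid
    NUL-terminated UTF-8.<br>
    <pre>
```c
/*
 * Hypothetical helper: step over one UTF-8 character, i.e. the lead
 * byte plus any continuation bytes (10xxxxxx). Assumes valid UTF-8.
 */
static const char *
next_char(const char *s)
{
	s++;
	while (((unsigned char)*s >> 6) == 2)
		s++;
	return s;
}

/*
 * Hypothetical sketch of LIKE-style matching: '%' matches any run of
 * characters, '_' matches exactly one character. Returns 1 on match.
 * Assumptions: case-sensitive, no ESCAPE, valid UTF-8 input.
 */
static int
like_match(const char *pat, const char *str)
{
	while (*pat != '\0') {
		if (*pat == '%') {
			pat++;
			/* Try the rest of the pattern at every character. */
			for (;;) {
				if (like_match(pat, str))
					return 1;
				if (*str == '\0')
					return 0;
				str = next_char(str);
			}
		}
		if (*str == '\0')
			return 0;
		if (*pat == '_') {
			pat++;
			str = next_char(str);	/* one character, any width */
			continue;
		}
		/* Literal: all bytes of the pattern character must match. */
		const char *pat_end = next_char(pat);
		while (pat != pat_end) {
			if (*pat != *str)
				return 0;
			pat++;
			str++;
		}
	}
	return *str == '\0';
}
```
</pre>
    With the two-byte letter 'Ё' ('\xD0\x81') as the string, the pattern
    '_' matches and '__' does not. A byte-oriented matcher would give the
    opposite answers, which is what a Unicode test case should catch.<br>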
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>+</div>
        <div>+test:do_catchsql_set_test(like_test_cases, prefix)</div>
        <div>+</div>
        <div>+-- Invalid testcases.</div>
        <div>+for i, tested_string in ipairs(invalid_testcases) do</div>
        <div>+</div>
        <div>+    -- We should raise an error in case</div>
        <div>+    -- pattern contains invalid characters.</div>
        <div>+</div>
        <div>+    local test_name = prefix .. "2." .. tostring(i)</div>
        <div>+    local test_itself = "SELECT 'abc' LIKE 'ab" ..
          tested_string .. "';"</div>
        <div>+    test:do_catchsql_test(test_name, test_itself,</div>
        <div>+                          {1, "LIKE or GLOB pattern can
          only contain UTF-8 characters"})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "3." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'abc' LIKE 'abc" ..
          tested_string .. "';"</div>
        <div>+    test:do_catchsql_test(test_name, test_itself,</div>
        <div>+                          {1, "LIKE or GLOB pattern can
          only contain UTF-8 characters"})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "4." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string
          .. "c';"</div>
        <div>+    test:do_catchsql_test(test_name, test_itself,</div>
        <div>+                          {1, "LIKE or GLOB pattern can
          only contain UTF-8 characters"})</div>
        <div>+</div>
        <div>+    -- Just skipping if row value predicand contains
          invalid character.</div>
      </div>
    </blockquote>
    What is a predicand? Is it a typo?<br>
    <blockquote type="cite"
cite="mid:CAEi+_armUY+G+KHR164aCGoEU759TocWOJEO01mXSDZ3wyuttA@mail.gmail.com">
      <div dir="ltr">
        <div>+</div>
        <div>+    test_name = prefix .. "5." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'ab" .. tested_string .. "' LIKE
          'abc';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "6." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'abc" .. tested_string .. "'
          LIKE 'abc';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "7." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'ab" .. tested_string .. "c'
          LIKE 'abc';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+end</div>
        <div>+</div>
        <div>+-- Valid testcases.</div>
        <div>+for i, tested_string in ipairs(valid_testcases) do</div>
        <div>+    test_name = prefix .. "8." .. tostring(i)</div>
        <div>+    local test_itself = "SELECT 'abc' LIKE 'ab" ..
          tested_string .. "';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "9." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'abc' LIKE 'abc" ..
          tested_string .. "';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "10." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string
          .. "c';"</div>
        <div>+    test:do_execsql_test(test_name,<span style="white-space:pre">	</span>test_itself,
          {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "11." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'ab" .. tested_string .. "' LIKE
          'abc';"</div>
        <div>+    test:do_execsql_test(test_name,<span style="white-space:pre">	</span>test_itself,
          {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "12." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'abc" .. tested_string .. "'
          LIKE 'abc';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+</div>
        <div>+    test_name = prefix .. "13." .. tostring(i)</div>
        <div>+    test_itself = "SELECT 'ab" .. tested_string .. "c'
          LIKE 'abc';"</div>
        <div>+    test:do_execsql_test(test_name, test_itself, {0})</div>
        <div>+end</div>
        <div>+</div>
        <div>+test:finish_test()</div>
      </div>
    </blockquote>
    Why can't I find a test for `GLOB`? Even if we delete it in the
    future, it should be tested. You can write far fewer tests for glob.<br>
    E.g. this<br>
    ```<br>
    select '1' glob '[0-4]';<br>
    ```<br>
    for some reason returns 0.<br>
    <br>
    Sorry, some of the tests I am asking you to write are a little outside
    the scope of the ticket, and they should already have been written.<br>
    But I suppose most of the ambiguity should be clarified now. This ticket
    has raised important questions related to those functions.<br>
  </body>
</html>

--------------B332719D4B32C324E9E80FCF--