<div dir="ltr">Hello, Alex!<div>Thanks for the review!</div><div><br><div class="gmail_quote"><div dir="ltr">On Thu, 19 Jul 2018 at 14:56, Alex Khatskevich <<a href="mailto:avkhatskevich@tarantool.org">avkhatskevich@tarantool.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hi.</p>
<p>I have some questions related to the func.c file; however, before
that I would ask you to fix the tests.</p>
<p>General ideas:</p>
1. Those tests are regression tests (they just check that the problem will
not appear again in the future).<br>
We name such tests in the following manner:
gh-XXXX-short-description.test.lua<br></div></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
2. The thing you test is not related to a table and other columns.<br>
Please convert the tests to the following format: {[[select '' like
'_B';]], {1}]]}.<br>
To make it more readable, you can do it like `like_testcases` in
`sql-tap/collation.test.lua`.<br></div></blockquote><div><br></div><div>Done.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
3. There are two extra things that should be tested:<br>
1. When a string or pattern ends with an invalid Unicode symbol
(e.g. half of a multi-byte Unicode symbol).<br>
2. When a string or pattern contains an invalid Unicode symbol.<br>
<br></div></blockquote><div><br></div><div>Refactored the tests with this idea taken into account.</div><div> </div><div>The comparing function is now only supposed to work with TEXT types,</div><div>which led to part of the proposals from #3572 being implemented.</div><div><br></div><div>I also added an error message for the case when the pattern contains an invalid symbol;</div><div>and now, if the string contains an invalid symbol, it cannot be matched by any</div><div>pattern. There are some minor fixes to the patternCompare function as well.</div><div><br></div><div>Here's the diff:<br><br></div><div><div>diff --git a/src/box/sql/func.c b/src/box/sql/func.c</div><div>index c06e3bd..5b53076 100644</div><div>--- a/src/box/sql/func.c</div><div>+++ b/src/box/sql/func.c</div><div>@@ -617,13 +617,17 @@ struct compareInfo {</div><div> <span style="white-space:pre"> </span>u8 noCase;<span style="white-space:pre"> </span>/* true to ignore case differences */</div><div> };</div><div> </div><div>-/*</div><div>- * For LIKE and GLOB matching on EBCDIC machines, assume that every</div><div>- * character is exactly one byte in size. 
Also, provde the Utf8Read()</div><div>- * macro for fast reading of the next character in the common case where</div><div>- * the next character is ASCII.</div><div>+/**</div><div>+ * Providing there are symbols in string s this</div><div>+ * macro returns UTF-8 code of character and</div><div>+ * promotes pointer to the next symbol in the string.</div><div>+ * Otherwise return code is SQL_END_OF_STRING.</div><div> */</div><div>-#define Utf8Read(s, e) ucnv_getNextUChar(pUtf8conv, &s, e, &status)</div><div>+#define Utf8Read(s, e) (((s) < (e)) ?\</div><div>+<span style="white-space:pre"> </span>ucnv_getNextUChar(pUtf8conv, &(s), (e), &(status)) : 0)</div><div>+</div><div>+#define SQL_END_OF_STRING 0</div><div>+#define SQL_INVALID_UTF8_SYMBOL 0xfffd</div><div> </div><div> static const struct compareInfo globInfo = { '*', '?', '[', 0 };</div><div> </div><div>@@ -643,51 +647,61 @@ static const struct compareInfo likeInfoAlt = { '%', '_', 0, 0 };</div><div> #define SQLITE_MATCH 0</div><div> #define SQLITE_NOMATCH 1</div><div> #define SQLITE_NOWILDCARDMATCH 2</div><div>+#define SQL_PROHIBITED_PATTERN 3</div><div> </div><div>-/*</div><div>- * Compare two UTF-8 strings for equality where the first string is</div><div>- * a GLOB or LIKE expression. Return values:</div><div>- *</div><div>- * SQLITE_MATCH: Match</div><div>- * SQLITE_NOMATCH: No match</div><div>- * SQLITE_NOWILDCARDMATCH: No match in spite of having * or % wildcards.</div><div>+/**</div><div>+ * Compare two UTF-8 strings for equality where the first string</div><div>+ * is a GLOB or LIKE expression.</div><div> *</div><div> * Globbing rules:</div><div> *</div><div>- * '*' Matches any sequence of zero or more characters.</div><div>+ * '*' Matches any sequence of zero or more characters.</div><div> *</div><div>- * '?' Matches exactly one character.</div><div>+ * '?' Matches exactly one character.</div><div> *</div><div>- * [...] 
Matches one character from the enclosed list of</div><div>- * characters.</div><div>+ * [...] Matches one character from the enclosed list of</div><div>+ * characters.</div><div> *</div><div>- * [^...] Matches one character not in the enclosed list.</div><div>+ * [^...] Matches one character not in the enclosed list.</div><div> *</div><div>- * With the [...] and [^...] matching, a ']' character can be included</div><div>- * in the list by making it the first character after '[' or '^'. A</div><div>- * range of characters can be specified using '-'. Example:</div><div>- * "[a-z]" matches any single lower-case letter. To match a '-', make</div><div>- * it the last character in the list.</div><div>+ * With the [...] and [^...] matching, a ']' character can be</div><div>+ * included in the list by making it the first character after</div><div>+ * '[' or '^'. A range of characters can be specified using '-'.</div><div>+ * Example: "[a-z]" matches any single lower-case letter.</div><div>+ * To match a '-', make it the last character in the list.</div><div> *</div><div> * Like matching rules:</div><div> *</div><div>- * '%' Matches any sequence of zero or more characters</div><div>+ * '%' Matches any sequence of zero or more characters.</div><div> *</div><div>- ** '_' Matches any one character</div><div>+ ** '_' Matches any one character.</div><div> *</div><div> * Ec Where E is the "esc" character and c is any other</div><div>- * character, including '%', '_', and esc, match exactly c.</div><div>+ * character, including '%', '_', and esc, match</div><div>+ * exactly c.</div><div> *</div><div> * The comments within this routine usually assume glob matching.</div><div> *</div><div>- * This routine is usually quick, but can be N**2 in the worst case.</div><div>+ * This routine is usually quick, but can be N**2 in the worst</div><div>+ * case.</div><div>+ *</div><div>+ * @param pattern String containing comparison pattern.</div><div>+ * @param string String being 
compared.</div><div>+ * @param compareInfo Information about how to compare.</div><div>+ * @param matchOther The escape char (LIKE) or '[' (GLOB).</div><div>+ *</div><div>+ * @retval SQLITE_MATCH: Match.</div><div>+ *<span style="white-space:pre"> </span> SQLITE_NOMATCH: No match.</div><div>+ *<span style="white-space:pre"> </span> SQLITE_NOWILDCARDMATCH: No match in spite of having *</div><div>+ *<span style="white-space:pre"> </span> or % wildcards.</div><div>+ *<span style="white-space:pre"> </span> SQL_PROHIBITED_PATTERN Pattern contains invalid</div><div>+ *<span style="white-space:pre"> </span> symbol.</div><div> */</div><div> static int</div><div>-patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div>-<span style="white-space:pre"> </span> const char * string,<span style="white-space:pre"> </span>/* The string to compare against the glob */</div><div>-<span style="white-space:pre"> </span> const struct compareInfo *pInfo,<span style="white-space:pre"> </span>/* Information about how to do the compare */</div><div>-<span style="white-space:pre"> </span> UChar32 matchOther<span style="white-space:pre"> </span>/* The escape char (LIKE) or '[' (GLOB) */</div><div>- )</div><div>+sql_utf8_pattern_compare(const char * pattern,</div><div>+<span style="white-space:pre"> </span> const char * string,</div><div>+<span style="white-space:pre"> </span> const struct compareInfo *pInfo,</div><div>+<span style="white-space:pre"> </span> UChar32 matchOther)</div><div> {</div><div> <span style="white-space:pre"> </span>UChar32 c, c2;<span style="white-space:pre"> </span>/* Next pattern and input string chars */</div><div> <span style="white-space:pre"> </span>UChar32 matchOne = pInfo->matchOne;<span style="white-space:pre"> </span>/* "?" 
or "_" */</div><div>@@ -698,29 +712,41 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span>const char * string_end = string + strlen(string);</div><div> <span style="white-space:pre"> </span>UErrorCode status = U_ZERO_ERROR;</div><div> </div><div>-<span style="white-space:pre"> </span>while (pattern < pattern_end){</div><div>-<span style="white-space:pre"> </span>c = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>while ((c = Utf8Read(pattern, pattern_end)) != SQL_END_OF_STRING) {</div><div>+<span style="white-space:pre"> </span>if (c == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span style="white-space:pre"> </span>if (c == matchAll) {<span style="white-space:pre"> </span>/* Match "*" */</div><div> <span style="white-space:pre"> </span>/* Skip over multiple "*" characters in the pattern. If there</div><div> <span style="white-space:pre"> </span> * are also "?" characters, skip those as well, but consume a</div><div> <span style="white-space:pre"> </span> * single character of the input string for each "?" 
skipped</div><div> <span style="white-space:pre"> </span> */</div><div>-<span style="white-space:pre"> </span>while (pattern < pattern_end){</div><div>-<span style="white-space:pre"> </span>c = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>while ((c = Utf8Read(pattern, pattern_end)) !=</div><div>+<span style="white-space:pre"> </span> SQL_END_OF_STRING) {</div><div>+<span style="white-space:pre"> </span>if (c == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span style="white-space:pre"> </span>if (c != matchAll && c != matchOne)</div><div> <span style="white-space:pre"> </span>break;</div><div>-<span style="white-space:pre"> </span>if (c == matchOne</div><div>-<span style="white-space:pre"> </span> && Utf8Read(string, string_end) == 0) {</div><div>+<span style="white-space:pre"> </span>if (c == matchOne &&</div><div>+<span style="white-space:pre"> </span> (c2 = Utf8Read(string, string_end)) ==</div><div>+<span style="white-space:pre"> </span> SQL_END_OF_STRING)</div><div> <span style="white-space:pre"> </span>return SQLITE_NOWILDCARDMATCH;</div><div>-<span style="white-space:pre"> </span>}</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>/* "*" at the end of the pattern matches */</div><div>-<span style="white-space:pre"> </span>if (pattern == pattern_end)</div><div>+<span style="white-space:pre"> </span>if (c == SQL_END_OF_STRING) {</div><div>+<span style="white-space:pre"> </span>while ((c2 = Utf8Read(string, string_end)) !=</div><div>+<span style="white-space:pre"> </span> SQL_END_OF_STRING)</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> 
<span style="white-space:pre"> </span>return SQLITE_MATCH;</div><div>+<span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>if (c == matchOther) {</div><div> <span style="white-space:pre"> </span>if (pInfo->matchSet == 0) {</div><div> <span style="white-space:pre"> </span>c = Utf8Read(pattern, pattern_end);</div><div>-<span style="white-space:pre"> </span>if (c == 0)</div><div>+<span style="white-space:pre"> </span>if (c == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div>+<span style="white-space:pre"> </span>if (c == SQL_END_OF_STRING)</div><div> <span style="white-space:pre"> </span>return SQLITE_NOWILDCARDMATCH;</div><div> <span style="white-space:pre"> </span>} else {</div><div> <span style="white-space:pre"> </span>/* "[...]" immediately follows the "*". We have to do a slow</div><div>@@ -729,13 +755,16 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span>assert(matchOther < 0x80);<span style="white-space:pre"> </span>/* '[' is a single-byte character */</div><div> <span style="white-space:pre"> </span>while (string < string_end) {</div><div> <span style="white-space:pre"> </span>int bMatch =</div><div>-<span style="white-space:pre"> </span> patternCompare(&pattern[-1],</div><div>-<span style="white-space:pre"> </span> string,</div><div>-<span style="white-space:pre"> </span> pInfo,</div><div>-<span style="white-space:pre"> </span> matchOther);</div><div>+<span style="white-space:pre"> </span> sql_utf8_pattern_compare(</div><div>+<span style="white-space:pre"> </span>&pattern[-1],</div><div>+<span style="white-space:pre"> </span>string,</div><div>+<span style="white-space:pre"> </span>pInfo,</div><div>+<span style="white-space:pre"> </span>matchOther);</div><div> <span style="white-space:pre"> </span>if (bMatch != SQLITE_NOMATCH)</div><div> <span 
style="white-space:pre"> </span>return bMatch;</div><div>-<span style="white-space:pre"> </span>Utf8Read(string, string_end);</div><div>+<span style="white-space:pre"> </span>c = Utf8Read(string, string_end);</div><div>+<span style="white-space:pre"> </span>if (c == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>return SQLITE_NOWILDCARDMATCH;</div><div> <span style="white-space:pre"> </span>}</div><div>@@ -764,6 +793,8 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span> * languages.</div><div> <span style="white-space:pre"> </span> */</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(string, string_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>if (!noCase) {</div><div> <span style="white-space:pre"> </span>if (c2 != c)</div><div> <span style="white-space:pre"> </span>continue;</div><div>@@ -771,9 +802,10 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span>if (c2 != c && u_tolower(c2) != c)</div><div> <span style="white-space:pre"> </span>continue;</div><div> <span style="white-space:pre"> </span>}</div><div>-<span style="white-space:pre"> </span>bMatch =</div><div>-<span style="white-space:pre"> </span> patternCompare(pattern, string,</div><div>-<span style="white-space:pre"> </span> pInfo, matchOther);</div><div>+<span style="white-space:pre"> </span>bMatch = sql_utf8_pattern_compare(pattern,</div><div>+<span style="white-space:pre"> </span> string,</div><div>+<span style="white-space:pre"> </span> pInfo,</div><div>+<span style="white-space:pre"> 
</span> matchOther);</div><div> <span style="white-space:pre"> </span>if (bMatch != SQLITE_NOMATCH)</div><div> <span style="white-space:pre"> </span>return bMatch;</div><div> <span style="white-space:pre"> </span>}</div><div>@@ -782,7 +814,9 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span>if (c == matchOther) {</div><div> <span style="white-space:pre"> </span>if (pInfo->matchSet == 0) {</div><div> <span style="white-space:pre"> </span>c = Utf8Read(pattern, pattern_end);</div><div>-<span style="white-space:pre"> </span>if (c == 0)</div><div>+<span style="white-space:pre"> </span>if (c == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div>+<span style="white-space:pre"> </span>if (c == SQL_END_OF_STRING)</div><div> <span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>zEscaped = pattern;</div><div> <span style="white-space:pre"> </span>} else {</div><div>@@ -790,23 +824,33 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span>int seen = 0;</div><div> <span style="white-space:pre"> </span>int invert = 0;</div><div> <span style="white-space:pre"> </span>c = Utf8Read(string, string_end);</div><div>+<span style="white-space:pre"> </span>if (c == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>if (string == string_end)</div><div> <span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span 
style="white-space:pre"> </span>if (c2 == '^') {</div><div> <span style="white-space:pre"> </span>invert = 1;</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>if (c2 == ']') {</div><div> <span style="white-space:pre"> </span>if (c == ']')</div><div> <span style="white-space:pre"> </span>seen = 1;</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span style="white-space:pre"> </span>}</div><div>-<span style="white-space:pre"> </span>while (c2 && c2 != ']') {</div><div>+<span style="white-space:pre"> </span>while (c2 != SQL_END_OF_STRING && c2 != ']') {</div><div> <span style="white-space:pre"> </span>if (c2 == '-' && pattern[0] != ']'</div><div> <span style="white-space:pre"> </span> && pattern < pattern_end</div><div> <span style="white-space:pre"> </span> && prior_c > 0) {</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span style="white-space:pre"> </span>if (c >= prior_c && c <= c2)</div><div> <span style="white-space:pre"> </span>seen = 1;</div><div> <span style="white-space:pre"> </span>prior_c = 0;</div><div>@@ -817,14 +861,19 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span>prior_c = c2;</div><div> <span style="white-space:pre"> 
</span>}</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(pattern, pattern_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQL_PROHIBITED_PATTERN;</div><div> <span style="white-space:pre"> </span>}</div><div>-<span style="white-space:pre"> </span>if (pattern == pattern_end || (seen ^ invert) == 0) {</div><div>+<span style="white-space:pre"> </span>if (pattern == pattern_end ||</div><div>+<span style="white-space:pre"> </span> (seen ^ invert) == 0) {</div><div> <span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>continue;</div><div> <span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>c2 = Utf8Read(string, string_end);</div><div>+<span style="white-space:pre"> </span>if (c2 == SQL_INVALID_UTF8_SYMBOL)</div><div>+<span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span style="white-space:pre"> </span>if (c == c2)</div><div> <span style="white-space:pre"> </span>continue;</div><div> <span style="white-space:pre"> </span>if (noCase){</div><div>@@ -839,7 +888,8 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> <span style="white-space:pre"> </span> c == u_tolower(c2))</div><div> <span style="white-space:pre"> </span>continue;</div><div> <span style="white-space:pre"> </span>}</div><div>-<span style="white-space:pre"> </span>if (c == matchOne && pattern != zEscaped && c2 != 0)</div><div>+<span style="white-space:pre"> </span>if (c == matchOne && pattern != zEscaped &&</div><div>+<span style="white-space:pre"> </span> c2 != SQL_END_OF_STRING)</div><div> <span style="white-space:pre"> </span>continue;</div><div> <span style="white-space:pre"> </span>return SQLITE_NOMATCH;</div><div> <span 
style="white-space:pre"> </span>}</div><div>@@ -853,8 +903,7 @@ patternCompare(const char * pattern,<span style="white-space:pre"> </span>/* The glob pattern */</div><div> int</div><div> sqlite3_strglob(const char *zGlobPattern, const char *zString)</div><div> {</div><div>-<span style="white-space:pre"> </span>return patternCompare(zGlobPattern, zString, &globInfo,</div><div>-<span style="white-space:pre"> </span> '[');</div><div>+<span style="white-space:pre"> </span>return sql_utf8_pattern_compare(zGlobPattern, zString, &globInfo, '[');</div><div> }</div><div> </div><div> /*</div><div>@@ -864,7 +913,7 @@ sqlite3_strglob(const char *zGlobPattern, const char *zString)</div><div> int</div><div> sqlite3_strlike(const char *zPattern, const char *zStr, unsigned int esc)</div><div> {</div><div>-<span style="white-space:pre"> </span>return patternCompare(zPattern, zStr, &likeInfoNorm, esc);</div><div>+<span style="white-space:pre"> </span>return sql_utf8_pattern_compare(zPattern, zStr, &likeInfoNorm, esc);</div><div> }</div><div> </div><div> /*</div><div>@@ -910,8 +959,9 @@ likeFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)</div><div> <span style="white-space:pre"> </span>zB = (const char *) sqlite3_value_text(argv[0]);</div><div> <span style="white-space:pre"> </span>zA = (const char *) sqlite3_value_text(argv[1]);</div><div> </div><div>-<span style="white-space:pre"> </span>/* Limit the length of the LIKE or GLOB pattern to avoid problems</div><div>-<span style="white-space:pre"> </span> * of deep recursion and N*N behavior in patternCompare().</div><div>+<span style="white-space:pre"> </span>/* Limit the length of the LIKE or GLOB pattern to avoid</div><div>+<span style="white-space:pre"> </span> * problems of deep recursion and N*N behavior in</div><div>+<span style="white-space:pre"> </span> * sql_utf8_pattern_compare().</div><div> <span style="white-space:pre"> </span> */</div><div> <span style="white-space:pre"> </span>nPat = 
sqlite3_value_bytes(argv[0]);</div><div> <span style="white-space:pre"> </span>testcase(nPat == db->aLimit[SQLITE_LIMIT_LIKE_PATTERN_LENGTH]);</div><div>@@ -947,7 +997,12 @@ likeFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)</div><div> <span style="white-space:pre"> </span>sqlite3_like_count++;</div><div> #endif</div><div> <span style="white-space:pre"> </span>int res;</div><div>-<span style="white-space:pre"> </span>res = patternCompare(zB, zA, pInfo, escape);</div><div>+<span style="white-space:pre"> </span>res = sql_utf8_pattern_compare(zB, zA, pInfo, escape);</div><div>+<span style="white-space:pre"> </span>if (res == SQL_PROHIBITED_PATTERN) {</div><div>+<span style="white-space:pre"> </span>sqlite3_result_error(context, "LIKE or GLOB pattern can only"</div><div>+<span style="white-space:pre"> </span> " contain UTF-8 characters", -1);</div><div>+<span style="white-space:pre"> </span>return;</div><div>+<span style="white-space:pre"> </span>}</div><div> <span style="white-space:pre"> </span>sqlite3_result_int(context, res == SQLITE_MATCH);</div><div> }</div><div> </div><div>diff --git a/test/sql-tap/e_expr.test.lua b/test/sql-tap/e_expr.test.lua</div><div>index 13d3a96..051210a 100755</div><div>--- a/test/sql-tap/e_expr.test.lua</div><div>+++ b/test/sql-tap/e_expr.test.lua</div><div>@@ -1,6 +1,6 @@</div><div> #!/usr/bin/env tarantool</div><div> test = require("sqltester")</div><div>-test:plan(12431)</div><div>+test:plan(10665)</div><div> </div><div> --!./tcltestrunner.lua</div><div> -- 2010 July 16</div><div>@@ -77,8 +77,10 @@ local operations = {</div><div> {"<>", "ne1"},</div><div> {"!=", "ne2"},</div><div> {"IS", "is"},</div><div>- {"LIKE", "like"},</div><div>- {"GLOB", "glob"},</div><div>+-- NOTE: This test needs refactoring after deletion of GLOB &</div><div>+--<span style="white-space:pre"> </span> type restrictions for LIKE.</div><div>+-- {"LIKE", "like"},</div><div>+-- {"GLOB", "glob"},</div><div> {"AND", "and"},</div><div> {"OR", 
"or"},</div><div> {"MATCH", "match"},</div><div>@@ -96,7 +98,7 @@ operations = {</div><div> {"+", "-"},</div><div> {"<<", ">>", "&", "|"},</div><div> {"<", "<=", ">", ">="},</div><div>- {"=", "==", "!=", "<>", "LIKE", "GLOB"}, --"MATCH", "REGEXP"},</div><div>+ {"=", "==", "!=", "<>"}, --"LIKE", "GLOB"}, "MATCH", "REGEXP"},</div><div> {"AND"},</div><div> {"OR"},</div><div> }</div><div>@@ -475,6 +477,7 @@ for _, op in ipairs(oplist) do</div><div> end</div><div> end</div><div> end</div><div>+</div><div> ---------------------------------------------------------------------------</div><div> -- Test the IS and IS NOT operators.</div><div> --</div><div>diff --git a/test/sql-tap/gh-3251-string-pattern-comparison.lua b/test/sql-tap/gh-3251-string-pattern-comparison.lua</div><div>new file mode 100755</div><div>index 0000000..0202efc</div><div>--- /dev/null</div><div>+++ b/test/sql-tap/gh-3251-string-pattern-comparison.lua</div><div>@@ -0,0 +1,238 @@</div><div>+#!/usr/bin/env tarantool</div><div>+test = require("sqltester")</div><div>+test:plan(106)</div><div>+</div><div>+local prefix = "like-test-"</div><div>+</div><div>+-- Unciode byte sequences.</div><div>+</div><div>+local valid_testcases = {</div><div>+<span style="white-space:pre"> </span>'\x01',</div><div>+<span style="white-space:pre"> </span>'\x09',</div><div>+<span style="white-space:pre"> </span>'\x1F',</div><div>+<span style="white-space:pre"> </span>'\x7F',</div><div>+<span style="white-space:pre"> </span>'\xC2\x80',</div><div>+<span style="white-space:pre"> </span>'\xC2\x90',</div><div>+<span style="white-space:pre"> </span>'\xC2\x9F',</div><div>+<span style="white-space:pre"> </span>'\xE2\x80\xA8',</div><div>+<span style="white-space:pre"> </span>'\x20\x0B',</div><div>+<span style="white-space:pre"> </span>'\xE2\x80\xA9',</div><div>+}</div><div>+</div><div>+-- Non-Unicode byte sequences.</div><div>+local invalid_testcases = {</div><div>+<span style="white-space:pre"> </span>'\xE2\x80',</div><div>+<span 
style="white-space:pre"> </span>'\xFE\xFF',</div><div>+<span style="white-space:pre"> </span>'\xC2',</div><div>+<span style="white-space:pre"> </span>'\xED\xB0\x80',</div><div>+<span style="white-space:pre"> </span>'\xD0',</div><div>+}</div><div>+</div><div>+local like_test_cases =</div><div>+{</div><div>+<span style="white-space:pre"> </span>{"1.1",</div><div>+<span style="white-space:pre"> </span>[[</div><div>+<span style="white-space:pre"> </span>CREATE TABLE t2 (column1 INTEGER,</div><div>+<span style="white-space:pre"> </span> column2 VARCHAR(100),</div><div>+<span style="white-space:pre"> </span> column3 BLOB,</div><div>+<span style="white-space:pre"> </span> column4 FLOAT,</div><div>+<span style="white-space:pre"> </span> PRIMARY KEY (column1, column2));</div><div>+<span style="white-space:pre"> </span>INSERT INTO t2 VALUES (1, 'AB', X'4142', 5.5);</div><div>+<span style="white-space:pre"> </span>INSERT INTO t2 VALUES (1, 'CD', X'2020', 1E4);</div><div>+<span style="white-space:pre"> </span>INSERT INTO t2 VALUES (2, 'AB', X'2020', 12.34567);</div><div>+<span style="white-space:pre"> </span>INSERT INTO t2 VALUES (-1000, '', X'', 0.0);</div><div>+<span style="white-space:pre"> </span>CREATE TABLE t1 (a INT PRIMARY KEY, str VARCHAR(100));</div><div>+<span style="white-space:pre"> </span>INSERT INTO t1 VALUES (1, 'ab');</div><div>+<span style="white-space:pre"> </span>INSERT INTO t1 VALUES (2, 'abCDF');</div><div>+<span style="white-space:pre"> </span>INSERT INTO t1 VALUES (3, 'CDF');</div><div>+<span style="white-space:pre"> </span>CREATE TABLE t (s1 CHAR(2) PRIMARY KEY, s2 CHAR(2));</div><div>+<span style="white-space:pre"> </span>INSERT INTO t VALUES ('AB', 'AB');</div><div>+<span style="white-space:pre"> </span>]], {0}},</div><div>+<span style="white-space:pre"> </span>{"1.2",</div><div>+<span style="white-space:pre"> </span>[[</div><div>+<span style="white-space:pre"> </span>SELECT column1, column2, column1 * column4 FROM</div><div>+<span 
style="white-space:pre"> </span>t2 WHERE column2 LIKE '_B';</div><div>+<span style="white-space:pre"> </span>]],</div><div>+<span style="white-space:pre"> </span>{0, {1, 'AB', 5.5, 2, 'AB', 24.69134}} },</div><div>+<span style="white-space:pre"> </span>{"1.3",</div><div>+<span style="white-space:pre"> </span>"SELECT column1, column2 FROM t2 WHERE column2 LIKE '%B';",</div><div>+<span style="white-space:pre"> </span>{0, {1, 'AB', 2, 'AB'}} },</div><div>+<span style="white-space:pre"> </span>{"1.4",</div><div>+<span style="white-space:pre"> </span>"SELECT column1, column2 FROM t2 WHERE column2 LIKE 'A__';",</div><div>+<span style="white-space:pre"> </span>{0, {}} },</div><div>+<span style="white-space:pre"> </span>{"1.5",</div><div>+<span style="white-space:pre"> </span>"SELECT column1, column2 FROM t2 WHERE column2 LIKE 'A_';",</div><div>+<span style="white-space:pre"> </span>{0, {1, 'AB', 2, 'AB'}} },</div><div>+<span style="white-space:pre"> </span>{"1.6",</div><div>+<span style="white-space:pre"> </span>"SELECT column1, column2 FROM t2 WHERE column2 LIKE 'A';",</div><div>+<span style="white-space:pre"> </span>{0, {}} },</div><div>+<span style="white-space:pre"> </span>{"1.7",</div><div>+<span style="white-space:pre"> </span>"SELECT column1, column2 FROM t2 WHERE column2 LIKE '_';",</div><div>+<span style="white-space:pre"> </span>{0, {}} },</div><div>+<span style="white-space:pre"> </span>{"1.8",</div><div>+<span style="white-space:pre"> </span>"SELECT * FROM t WHERE s1 LIKE '%A';",</div><div>+<span style="white-space:pre"> </span>{0, {}} },</div><div>+<span style="white-space:pre"> </span>{"1.9",</div><div>+<span style="white-space:pre"> </span>"SELECT * FROM t WHERE s1 LIKE '%C';",</div><div>+<span style="white-space:pre"> </span>{0, {}} },</div><div>+<span style="white-space:pre"> </span>{"1.10",</div><div>+<span style="white-space:pre"> </span>"SELECT * FROM t1 WHERE str LIKE '%df';",</div><div>+<span style="white-space:pre"> </span>{0, {2, 'abCDF', 3, 
'CDF'}} },</div><div>+<span style="white-space:pre"> </span>{"1.11",</div><div>+<span style="white-space:pre"> </span>"SELECT * FROM t1 WHERE str LIKE 'a_';",</div><div>+<span style="white-space:pre"> </span>{0, {1, 'ab'}} },</div><div>+<span style="white-space:pre"> </span>{"1.12",</div><div>+<span style="white-space:pre"> </span>"SELECT column1, column2 FROM t2 WHERE column2 LIKE '__';",</div><div>+<span style="white-space:pre"> </span>{0, {1, 'AB', 1, 'CD', 2, 'AB'}} },</div><div>+<span style="white-space:pre"> </span>{"1.13",</div><div>+<span style="white-space:pre"> </span>"SELECT str FROM t1 WHERE str LIKE 'ab%';",</div><div>+<span style="white-space:pre"> </span>{0, {'ab', 'abCDF'}} },</div><div>+<span style="white-space:pre"> </span>{"1.14",</div><div>+<span style="white-space:pre"> </span>"SELECT str FROM t1 WHERE str LIKE 'abC%';",</div><div>+<span style="white-space:pre"> </span>{0, {'abCDF'}} },</div><div>+<span style="white-space:pre"> </span>{"1.15",</div><div>+<span style="white-space:pre"> </span>"SELECT str FROM t1 WHERE str LIKE 'a_%';",</div><div>+<span style="white-space:pre"> </span>{0, {'ab', 'abCDF'}} },</div><div>+<span style="white-space:pre"> </span>{"1.16",</div><div>+<span style="white-space:pre"> </span>[[</div><div>+<span style="white-space:pre"> </span>DROP TABLE t1;</div><div>+<span style="white-space:pre"> </span>DROP TABLE t2;</div><div>+<span style="white-space:pre"> </span>DROP TABLE t;</div><div>+<span style="white-space:pre"> </span>]], {0}},</div><div>+}</div><div>+</div><div>+test:do_catchsql_set_test(like_test_cases, prefix)</div><div>+</div><div>+-- Invalid testcases.</div><div>+</div><div>+for i, tested_string in ipairs(invalid_testcases) do</div><div>+<span style="white-space:pre"> </span>local test_name = prefix .. "2." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>local test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. 
"';"</div><div>+</div><div>+-- We should raise an error if pattern contains invalid characters.</div><div>+<span style="white-space:pre"> </span></div><div>+<span style="white-space:pre"> </span>test:do_catchsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>1, "LIKE or GLOB pattern can only contain UTF-8 characters"</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "3." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'abc' LIKE 'abc" .. tested_string .. "';"</div><div>+<span style="white-space:pre"> </span>test:do_catchsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>1, "LIKE or GLOB pattern can only contain UTF-8 characters"</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "4." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. 
"c';"</div><div>+<span style="white-space:pre"> </span>test:do_catchsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>1, "LIKE or GLOB pattern can only contain UTF-8 characters"</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+-- Just skipping if row value predicand contains invalid character.</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "5." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'ab" .. tested_string .. "' LIKE 'abc';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "6." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'abc" .. tested_string .. "' LIKE 'abc';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "7." .. 
tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'ab" .. tested_string .. "c' LIKE 'abc';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+end</div><div>+</div><div>+-- Valid testcases.</div><div>+</div><div>+for i, tested_string in ipairs(valid_testcases) do</div><div>+<span style="white-space:pre"> </span>local test_name = prefix .. "8." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>local test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. "';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "9." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'abc' LIKE 'abc" .. tested_string .. 
"';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "10." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'abc' LIKE 'ab" .. tested_string .. "c';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "11." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'ab" .. tested_string .. "' LIKE 'abc';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "12." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'abc" .. tested_string .. 
"' LIKE 'abc';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+</div><div>+<span style="white-space:pre"> </span>test_name = prefix .. "13." .. tostring(i)</div><div>+<span style="white-space:pre"> </span>test_itself = "SELECT 'ab" .. tested_string .. "c' LIKE 'abc';"</div><div>+<span style="white-space:pre"> </span>test:do_execsql_test(</div><div>+<span style="white-space:pre"> </span>test_name,</div><div>+<span style="white-space:pre"> </span>test_itself, {</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>0</div><div>+<span style="white-space:pre"> </span>-- <test_name></div><div>+<span style="white-space:pre"> </span>})</div><div>+end</div><div>+</div><div>+test:finish_test()</div></div><div><br></div></div></div></div>