From: Alexander Turenko <alexander.turenko@tarantool.org> To: Nikita Tatunov <hollow653@gmail.com>, Alex Khatskevich <avkhatskevich@tarantool.org> Cc: tarantool-patches@freelists.org Subject: [tarantool-patches] Re: [PATCH] sql: LIKE & GLOB pattern comparison issue Date: Wed, 18 Jul 2018 05:43:14 +0300 [thread overview] Message-ID: <20180718024314.be245cmsgklxuvnk@tkn_work_nb> (raw) In-Reply-To: <1530190036-10105-1-git-send-email-hollow653@gmail.com> i Nikita! I don't have objections against this patch in general, just several minor comments and side notes. Alexey, as I see you are familiar with the code. Can you give a review for the patch? WBR, Alexander Turenko. On Thu, Jun 28, 2018 at 03:47:16PM +0300, N.Tatunov wrote: > Currently function that compares pattern and > string for GLOB & LIKE operators doesn't work properly. > It uses ICU reading function which perhaps was working > differently before and the implementation for the > comparison ending isn't paying attention to some > special cases, hence in those cases it works improperly. > Now the checks for comparison should work fine. > > Сloses: #3251 > Сloses: #3334 Use 'Closes #xxxx' form. Don't sure the form with a colon will work. Utf8Read macro comment now completely irrelevant to its definition. Compare to sqlite3: 584 /* 585 ** For LIKE and GLOB matching on EBCDIC machines, assume that every 586 ** character is exactly one byte in size. Also, provde the Utf8Read() 587 ** macro for fast reading of the next character in the common case where 588 ** the next character is ASCII. 589 */ 590 #if defined(SQLITE_EBCDIC) 591 # define sqlite3Utf8Read(A) (*((*A)++)) 592 # define Utf8Read(A) (*(A++)) 593 #else 594 # define Utf8Read(A) (A[0]<0x80?*(A++):sqlite3Utf8Read(&A)) 595 #endif > --- > src/box/sql/func.c | 25 ++++---- > test/sql-tap/like1.test.lua | 152 ++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 165 insertions(+), 12 deletions(-) > create mode 100755 test/sql-tap/like1.test.lua > > diff --git a/src/box/sql/func.c b/src/box/sql/func.c > index c06e3bd..dcbd7e0 100644 > --- a/src/box/sql/func.c > +++ b/src/box/sql/func.c > @@ -643,6 +643,7 @@ static const struct compareInfo likeInfoAlt = { '%', '_', 0, 0 }; > #define SQLITE_MATCH 0 > #define SQLITE_NOMATCH 1 > #define SQLITE_NOWILDCARDMATCH 2 > +#define SQL_NO_SYMBOLS_LEFT 65535 > 0xffff looks better. I would use name like ICU_ERROR. By the way, it can mean not only end of a string, but also other errors. It is okay to give 'not matched' in the case, I think. Hmm, we can misinterpret some error and give 'matched' result, I guess... maybe we really should check start < end in the macro to distinguish this cases. Don't sure about the test case. As I see from ICU source code it can be some internal buffer overflow error. We can do the check like so: -#define Utf8Read(s, e) ucnv_getNextUChar(pUtf8conv, &s, e, &status) +#define Utf8Read(s, e) (((s) < (e)) ? ucnv_getNextUChar(pUtf8conv, &(s), (e), \ + &(status)) : 0) -#define SQL_NO_SYMBOLS_LEFT 65535 +#define SQL_NO_SYMBOLS_LEFT 0 > /* > * Compare two UTF-8 strings for equality where the first string is > @@ -698,29 +699,28 @@ patternCompare(const char * pattern, /* The glob pattern */ > const char * string_end = string + strlen(string); > UErrorCode status = U_ZERO_ERROR; > > - while (pattern < pattern_end){ > - c = Utf8Read(pattern, pattern_end); > + while ((c = Utf8Read(pattern, pattern_end)) != SQL_NO_SYMBOLS_LEFT) { Here and everywhere below we lean on the fact that ucnv_getNextUChar compares pointers and that is so: 1860 s=*source; 1861 if(sourceLimit<s) { 1862 *err=U_ILLEGAL_ARGUMENT_ERROR; 1863 return 0xffff; 1864 } By the way, the patch reverts partially implemented approach to preliminary check start < end introduced in 198d59ce93. I think it is okay too, because some checks were missed and now they exist again. > diff --git a/test/sql-tap/like1.test.lua b/test/sql-tap/like1.test.lua > new file mode 100755 > index 0000000..42b4d43 > --- /dev/null > +++ b/test/sql-tap/like1.test.lua > @@ -0,0 +1,152 @@ > +#!/usr/bin/env tarantool > +test = require("sqltester") > +test:plan(13) > + Maybe case with % at the end? I tried, that works, but it would be good to have a regression test.
next prev parent reply other threads:[~2018-07-18 2:43 UTC|newest] Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-06-28 12:47 [tarantool-patches] " N.Tatunov 2018-06-28 12:54 ` [tarantool-patches] " Hollow111 2018-07-18 2:43 ` Alexander Turenko [this message] 2018-07-18 5:51 ` Alex Khatskevich 2018-07-18 15:24 ` Nikita Tatunov 2018-07-18 15:53 ` Alex Khatskevich 2018-07-18 15:57 ` Nikita Tatunov 2018-07-18 17:10 ` Alexander Turenko 2018-07-19 11:14 ` Nikita Tatunov 2018-07-19 11:56 ` Alex Khatskevich 2018-07-27 11:28 ` Nikita Tatunov 2018-07-27 13:06 ` Alexander Turenko 2018-07-27 19:11 ` Nikita Tatunov 2018-07-27 20:22 ` Alexander Turenko 2018-07-31 13:27 ` Nikita Tatunov 2018-07-31 13:47 ` Alexander Turenko 2018-08-01 10:35 ` Nikita Tatunov 2018-08-01 10:51 ` Nikita Tatunov 2018-08-01 13:56 ` Alex Khatskevich 2018-08-01 18:10 ` Nikita Tatunov 2018-08-01 18:14 ` Nikita Tatunov 2018-08-08 12:38 ` Alex Khatskevich
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20180718024314.be245cmsgklxuvnk@tkn_work_nb \ --to=alexander.turenko@tarantool.org \ --cc=avkhatskevich@tarantool.org \ --cc=hollow653@gmail.com \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH] sql: LIKE & GLOB pattern comparison issue' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox