From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <tarantool-patches-bounce@freelists.org> Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 4A9082A7D8 for <tarantool-patches@freelists.org>; Tue, 11 Sep 2018 06:07:04 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id X8brS_6AlyEd for <tarantool-patches@freelists.org>; Tue, 11 Sep 2018 06:07:04 -0400 (EDT) Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 5955628BE3 for <tarantool-patches@freelists.org>; Tue, 11 Sep 2018 06:07:03 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH 1/2] sql: LIKE & GLOB pattern comparison issue References: <cover.1534436835.git.n.tatunov@tarantool.org> <43febf82af3702fadfea135db978ffb6426eb00d.1534436836.git.n.tatunov@tarantool.org> <d11b496b-1d77-c1ef-27dd-874835bee1b9@tarantool.org> <20180817111727.y6nsbblpm5nh4n3g@tkn_work_nb> <436d256a-f9d0-781f-8cad-179d7322c7bd@tarantool.org> <87897608-173E-45EB-80A1-8B249706D8A1@tarantool.org> <6a1352e9-425c-d656-1bec-bb04d9f0fee6@tarantool.org> <58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org> From: Alex Khatskevich <avkhatskevich@tarantool.org> Message-ID: <860a125b-19f3-3bf1-8705-25156ff508ab@tarantool.org> Date: Tue, 11 Sep 2018 13:06:57 +0300 MIME-Version: 1.0 In-Reply-To: <58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org> Content-Type: multipart/alternative; boundary="------------3DEA5F9B877AB120207F99D8" Content-Language: en-US Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: <mailto:ecartis@freelists.org?Subject=help> List-unsubscribe: <tarantool-patches-request@freelists.org?Subject=unsubscribe> List-software: Ecartis version 1.0.0 List-Id: tarantool-patches <tarantool-patches.freelists.org> List-subscribe: <tarantool-patches-request@freelists.org?Subject=subscribe> List-owner: <mailto:> List-post: <mailto:tarantool-patches@freelists.org> List-archive: <http://www.freelists.org/archives/tarantool-patches> To: Nikita Tatunov <n.tatunov@tarantool.org>, tarantool-patches@freelists.org Cc: Alexander Turenko <alexander.turenko@tarantool.org>, korablev@tarantool.org This is a multi-part message in MIME format. --------------3DEA5F9B877AB120207F99D8 Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Transfer-Encoding: 8bit On 11.09.2018 09:06, Nikita Tatunov wrote: > > >> On 11 Sep 2018, at 01:20, Alex Khatskevich >> <avkhatskevich@tarantool.org <mailto:avkhatskevich@tarantool.org>> wrote: >> >>> >>> >>>> On 17 Aug 2018, at 14:42, Alex Khatskevich >>>> <avkhatskevich@tarantool.org <mailto:avkhatskevich@tarantool.org>> >>>> wrote: >>>> >>>> >>>> On 17.08.2018 14:17, Alexander Turenko wrote: >>>>> 0xffff is the result of 'end of a string' check as well as >>>>> internal buffer >>>>> overflow error. I have the relevant code pasted in the first review of >>>>> the patch (July, 18). >>>>> >>>>> // source/common/ucnv.c::ucnv_getNextUChar >>>>> 1860 s=*source; >>>>> 1861 if(sourceLimit<s) { >>>>> 1862 *err=U_ILLEGAL_ARGUMENT_ERROR; >>>>> 1863 return 0xffff; >>>>> 1864 } >>>>> >>>>> We should not handle the buffer overflow case as an invalid symbol. Of >>>>> course we should not handle it as the 'end of the string' situation. >>>>> Ideally we should perform pointer myself and raise an error in case of >>>>> 0xffff. I had thought that a buffer overflow error is unlikely to >>>>> meet, >>>>> but you are right: we should differentiate these situations. >>>>> >>>>> In one of the previous version of a patch we perform this check >>>>> like so: >>>>> >>>>> #define Utf8Read(s, e) (((s) < (e)) ?\ >>>>> ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0) >>>>> >>>>> Don't sure why it was changed. Maybe it is try to correctly handle >>>>> '\0' >>>>> symbol (it is valid unicode character)? >>>> The define you have pasted can return 0xffff. >>>> The reasons to change it back are described in the previous patchset. >>>> In short: >>>> 1. It is equivalent to >>>> a. check s < e in a while loop >>>> b. read next character inside of where loop body. >>>> 2. In some usages of the code this check (s<e) was redundant (it >>>> was performed a couple lines above) >>>> 3. There is no reason to rewrite the old version of this function. >>>> (So, we decided to use old version of the function) >>>>> So I see two ways to proceed: >>>>> >>>>> 1. Lean on icu's check and ignore possibility of the buffer overflow. >>>>> 2. Use our own check and possibly meet '\0' problems. >>>>> 3. Check for U_ILLEGAL_ARGUMENT_ERROR to treat as end of a string, >>>>> raise >>>>> the error for other 0xffff. >>>>> >>>>> Alex, what do you suggests here? >>>> As I understand, by now the 0xffff is used ONLY to handle the case >>>> of unexpectedly ended symbol. >>>> E.g. some symbol consists of 2 characters, but the length of the >>>> input buffer is 1. >>>> In my opinion this is the same as an invalid symbol. >>>> >>>> I guess that internal buffer overflow cannot occur in the >>>> `ucnv_getNextChar` function. >>>> >>>> I suppose that it is Nikitas duty to investigate this problem and >>>> explain it to us all. I just have noticed a strange usage. >>> >>> Hello, please consider my comments. >>> >>> There are some cases when 0xffff can occur, but: >>> 1) Cannot trigger in our context. >>> 2) Cannot trigger in our context. >>> 3) Only triggers if end < start. (Cannot happen in >>> sql_utf8_pattern_compare, i guess) >>> 4) Only triggers if string length > (size_t) 0x7ffffffff (can it >>> actually happen? I don’t think so). >>> 5) Occurs when trying to access to not unindexed data. >>> 6) Cannot occur in our context. >>> 7) Cannot occur in our context. >> I do not understand what are those numbers related to. Please, >> describe it. > > They are related to possible cases returning 0xffff from icu source > code (function ucnv_getNextUChar()). Can you just copy it here, so that anyone interested in that conversation can analyze it without looking for source files? > >>> >>> 0xfffd only means that symbol cannot be treated as a unicode symbol. >>> >>> Shall I change it somehow then? >>> >>> >>>> On 17 Aug 2018, at 12:23, Alex Khatskevich >>>> <avkhatskevich@tarantool.org <mailto:avkhatskevich@tarantool.org>> >>>> wrote: >>>> >>>> I have a look at icu code and It seems like 0xffff is an error, and >>>> it is more similar to >>>> invalid symbol that to "end of string". Check it, and fix the code, >>>> so that it is treated as >>>> an error. >>>> For example it is not handled in the main pattern loop: >>>> >>>> +while (pattern < pattern_end) { >>>> c = Utf8Read(pattern, pattern_end); >>>> +if (c == SQL_INVALID_UTF8_SYMBOL) >>>> +return SQL_INVALID_PATTERN; >>>> >>>> It seems like the 0xffff should be checked there too. >>> >>> No, it should not. This way it will only cause a bug when, for >>> example ’select “” like “”’ >>> will be treated as an error. >> I do not understand. >> ’select “” like “”’ should not even trap inside of the while loop >> (because`pattern < pattern_end` is false). > > Ah, you’re right, sorry, then it just doesn’t matter, since pattern < > pattern_end is equal > to 0xffff according to the comment above. > > -- > WBR, Nikita Tatunov. > n.tatunov@tarantool.org <mailto:n.tatunov@tarantool.org> > --------------3DEA5F9B877AB120207F99D8 Content-Type: text/html; charset="utf-8" Content-Transfer-Encoding: 8bit <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body text="#000000" bgcolor="#FFFFFF"> <p><br> </p> <br> <div class="moz-cite-prefix">On 11.09.2018 09:06, Nikita Tatunov wrote:<br> </div> <blockquote type="cite" cite="mid:58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <br class=""> <div><br class=""> <blockquote type="cite" class=""> <div class="">On 11 Sep 2018, at 01:20, Alex Khatskevich <<a href="mailto:avkhatskevich@tarantool.org" class="" moz-do-not-send="true">avkhatskevich@tarantool.org</a>> wrote:</div> <br class="Apple-interchange-newline"> <div class=""> <blockquote type="cite" cite="mid:87897608-173E-45EB-80A1-8B249706D8A1@tarantool.org" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;" class=""> <div class=""><br class="Apple-interchange-newline"> <br class=""> <blockquote type="cite" class=""> <div class="">On 17 Aug 2018, at 14:42, Alex Khatskevich <<a href="mailto:avkhatskevich@tarantool.org" class="" moz-do-not-send="true">avkhatskevich@tarantool.org</a>> wrote:</div> <br class="Apple-interchange-newline"> <div class=""><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">On 17.08.2018 14:17, Alexander Turenko wrote:</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">0xffff is the result of 'end of a string' check as well as internal buffer<br class=""> overflow error. I have the relevant code pasted in the first review of<br class=""> the patch (July, 18).<br class=""> <br class=""> // source/common/ucnv.c::ucnv_getNextUChar<br class=""> 1860 s=*source;<br class=""> 1861 if(sourceLimit<s) {<br class=""> 1862 *err=U_ILLEGAL_ARGUMENT_ERROR;<br class=""> 1863 return 0xffff;<br class=""> 1864 }<br class=""> <br class=""> We should not handle the buffer overflow case as an invalid symbol. Of<br class=""> course we should not handle it as the 'end of the string' situation.<br class=""> Ideally we should perform pointer myself and raise an error in case of<br class=""> 0xffff. I had thought that a buffer overflow error is unlikely to meet,<br class=""> but you are right: we should differentiate these situations.<br class=""> <br class=""> In one of the previous version of a patch we perform this check like so:<br class=""> <br class=""> #define Utf8Read(s, e) (((s) < (e)) ?\<br class=""> <span class="Apple-tab-span" style="white-space: pre;"> </span>ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0)<br class=""> <br class=""> Don't sure why it was changed. Maybe it is try to correctly handle '\0'<br class=""> symbol (it is valid unicode character)?<br class=""> </blockquote> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">The define you have pasted can return 0xffff.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">The reasons to change it back are described in the previous patchset.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">In short:</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">1. It is equivalent to</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;"> <span class="Apple-converted-space"> </span>a. check s < e in a while loop</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;"> <span class="Apple-converted-space"> </span>b. read next character inside of where loop body.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">2. In some usages of the code this check (s<e) was redundant (it was performed a couple lines above)</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">3. There is no reason to rewrite the old version of this function. (So, we decided to use old version of the function)</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">So I see two ways to proceed:<br class=""> <br class=""> 1. Lean on icu's check and ignore possibility of the buffer overflow.<br class=""> 2. Use our own check and possibly meet '\0' problems.<br class=""> 3. Check for U_ILLEGAL_ARGUMENT_ERROR to treat as end of a string, raise<br class=""> the error for other 0xffff.<br class=""> <br class=""> Alex, what do you suggests here?<br class=""> </blockquote> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">As I understand, by now the 0xffff is used ONLY to handle the case of unexpectedly ended symbol.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">E.g. some symbol consists of 2 characters, but the length of the input buffer is 1.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">In my opinion this is the same as an invalid symbol.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">I guess that internal buffer overflow cannot occur in the `ucnv_getNextChar` function.</span><br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <br class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"> <span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">I suppose that it is Nikitas duty to investigate this problem and explain it to us all. I just have noticed a strange usage.</span></div> </blockquote> </div> <div class=""><br class=""> </div> <div class="">Hello, please consider my comments.</div> <div class=""><br class=""> </div> <div class="">There are some cases when 0xffff can occur, but:</div> <div class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>1) <span class="" style="font-family: HelveticaNeue;">Cannot trigger in our context.</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;"> </span>2) C</span><span class="" style="font-family: HelveticaNeue;">annot trigger in our context.</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;"> </span>3) O</span><span class="" style="font-family: HelveticaNeue;">nly triggers if end < start. (Cannot happen in sql_utf8_pattern_compare, i guess)</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;"> </span>4) O</span><span class="" style="font-family: HelveticaNeue;">nly triggers if string length > (size_t) 0x7ffffffff (can it actually happen? I don’t think so).</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;"> </span>5) O</span><span class="" style="font-family: HelveticaNeue;">ccurs when trying to access to not unindexed data.</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;"> </span>6) Cannot occur in our context.</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;"> </span>7) </span><span class="" style="font-family: HelveticaNeue;">Cannot occur in our context.</span></div> </blockquote> <span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none; float: none; display: inline !important;" class="">I do not understand what are those numbers related to. Please, describe it.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;" class=""> </div> </blockquote> <div><br class=""> </div> <div>They are related to possible cases returning 0xffff from icu source code (function ucnv_getNextUChar()).</div> </div> </blockquote> Can you just copy it here, so that anyone interested in that conversation can<br> analyze it without looking for source files?<br> <blockquote type="cite" cite="mid:58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org"> <div><br class=""> <blockquote type="cite" class=""> <div class=""> <blockquote type="cite" cite="mid:87897608-173E-45EB-80A1-8B249706D8A1@tarantool.org" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;" class=""> <div class=""><span class="" style="font-family: HelveticaNeue;"><br class=""> </span></div> <div class=""><span class="" style="font-family: HelveticaNeue;">0xfffd only means that symbol cannot be treated as a unicode symbol.</span></div> <div class=""><span class="" style="font-family: HelveticaNeue;"><br class=""> </span></div> <div class=""> <div class="">Shall I change it somehow then?</div> </div> <div class=""><br class=""> </div> <div class=""><br class=""> <blockquote type="cite" class=""> <div class="">On 17 Aug 2018, at 12:23, Alex Khatskevich <<a href="mailto:avkhatskevich@tarantool.org" class="" moz-do-not-send="true">avkhatskevich@tarantool.org</a>> wrote:</div> <br class="Apple-interchange-newline"> <div class=""><span class="" style="float: none; display: inline !important;">I have a look at icu code and It seems like 0xffff is an error, and it is more similar to</span><br class=""> <span class="" style="float: none; display: inline !important;">invalid symbol that to "end of string". Check it, and fix the code, so that it is treated as</span><br class=""> <span class="" style="float: none; display: inline !important;">an error.</span><br class=""> <span class="" style="float: none; display: inline !important;">For example it is not handled in the main pattern loop:</span><br class=""> <br class=""> <span class="" style="float: none; display: inline !important;">+</span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="" style="float: none; display: inline !important;">while (pattern < pattern_end) {</span><br class=""> <span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="" style="float: none; display: inline !important;">c = Utf8Read(pattern, pattern_end);</span><br class=""> <span class="" style="float: none; display: inline !important;">+</span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="" style="float: none; display: inline !important;">if (c == SQL_INVALID_UTF8_SYMBOL)</span><br class=""> <span class="" style="float: none; display: inline !important;">+</span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="" style="float: none; display: inline !important;">return SQL_INVALID_PATTERN;</span><br class=""> <br class=""> <span class="" style="float: none; display: inline !important;">It seems like the 0xffff should be checked there too.</span></div> </blockquote> <br class=""> </div> <div class="">No, it should not. This way it will only cause a bug when, for example ’select “” like “”’</div> <div class="">will be treated as an error.</div> </blockquote> <span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none; float: none; display: inline !important;" class="">I do not understand.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;" class=""> <span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none; float: none; display: inline !important;" class="">’select “” like “”’ should not even trap inside of the while loop</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;" class=""> <span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none; float: none; display: inline !important;" class="">(because<span class="Apple-converted-space"> </span></span><span class="" style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none; float: none; display: inline !important;">`pattern < pattern_end` is false).<br class=""> </span><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none; float: none; display: inline !important;" class=""></span></div> </blockquote> <br class=""> </div> <div>Ah, you’re right, sorry, then it just doesn’t matter, since pattern < pattern_end is equal</div> <div>to 0xffff according to the comment above.</div> <br class=""> <div class=""> <div dir="auto" style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""> <div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">--</div> <div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">WBR, Nikita Tatunov.</div> <div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><a href="mailto:n.tatunov@tarantool.org" class="" moz-do-not-send="true">n.tatunov@tarantool.org</a></div> </div> </div> <br class=""> </blockquote> <br> </body> </html> --------------3DEA5F9B877AB120207F99D8--