From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounce@freelists.org>
Received: from localhost (localhost [127.0.0.1])
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 4A9082A7D8
	for <tarantool-patches@freelists.org>; Tue, 11 Sep 2018 06:07:04 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
	by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id X8brS_6AlyEd for <tarantool-patches@freelists.org>;
	Tue, 11 Sep 2018 06:07:04 -0400 (EDT)
Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 5955628BE3
	for <tarantool-patches@freelists.org>; Tue, 11 Sep 2018 06:07:03 -0400 (EDT)
Subject: [tarantool-patches] Re: [PATCH 1/2] sql: LIKE & GLOB pattern
 comparison issue
References: <cover.1534436835.git.n.tatunov@tarantool.org>
 <43febf82af3702fadfea135db978ffb6426eb00d.1534436836.git.n.tatunov@tarantool.org>
 <d11b496b-1d77-c1ef-27dd-874835bee1b9@tarantool.org>
 <20180817111727.y6nsbblpm5nh4n3g@tkn_work_nb>
 <436d256a-f9d0-781f-8cad-179d7322c7bd@tarantool.org>
 <87897608-173E-45EB-80A1-8B249706D8A1@tarantool.org>
 <6a1352e9-425c-d656-1bec-bb04d9f0fee6@tarantool.org>
 <58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org>
From: Alex Khatskevich <avkhatskevich@tarantool.org>
Message-ID: <860a125b-19f3-3bf1-8705-25156ff508ab@tarantool.org>
Date: Tue, 11 Sep 2018 13:06:57 +0300
MIME-Version: 1.0
In-Reply-To: <58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org>
Content-Type: multipart/alternative;
 boundary="------------3DEA5F9B877AB120207F99D8"
Content-Language: en-US
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help: <mailto:ecartis@freelists.org?Subject=help>
List-unsubscribe: <tarantool-patches-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches <tarantool-patches.freelists.org>
List-subscribe: <tarantool-patches-request@freelists.org?Subject=subscribe>
List-owner: <mailto:>
List-post: <mailto:tarantool-patches@freelists.org>
List-archive: <http://www.freelists.org/archives/tarantool-patches>
To: Nikita Tatunov <n.tatunov@tarantool.org>, tarantool-patches@freelists.org
Cc: Alexander Turenko <alexander.turenko@tarantool.org>, korablev@tarantool.org

This is a multi-part message in MIME format.
--------------3DEA5F9B877AB120207F99D8
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit



On 11.09.2018 09:06, Nikita Tatunov wrote:
>
>
>> On 11 Sep 2018, at 01:20, Alex Khatskevich 
>> <avkhatskevich@tarantool.org <mailto:avkhatskevich@tarantool.org>> wrote:
>>
>>>
>>>
>>>> On 17 Aug 2018, at 14:42, Alex Khatskevich 
>>>> <avkhatskevich@tarantool.org <mailto:avkhatskevich@tarantool.org>> 
>>>> wrote:
>>>>
>>>>
>>>> On 17.08.2018 14:17, Alexander Turenko wrote:
>>>>> 0xffff is the result of the 'end of a string' check as well as of the
>>>>> internal buffer overflow error. I pasted the relevant code in the first
>>>>> review of the patch (July 18).
>>>>>
>>>>> // source/common/ucnv.c::ucnv_getNextUChar
>>>>>     s=*source;
>>>>>     if(sourceLimit<s) {
>>>>>         *err=U_ILLEGAL_ARGUMENT_ERROR;
>>>>>         return 0xffff;
>>>>>     }
>>>>>
>>>>> We should not handle the buffer overflow case as an invalid symbol, and
>>>>> of course we should not handle it as the 'end of the string' situation.
>>>>> Ideally we should perform the pointer check ourselves and raise an error
>>>>> in case of 0xffff. I had thought that a buffer overflow error was
>>>>> unlikely to occur, but you are right: we should differentiate these
>>>>> situations.
>>>>>
>>>>> In one of the previous versions of the patch we performed this check
>>>>> like so:
>>>>>
>>>>> #define Utf8Read(s, e) (((s) < (e)) ?\
>>>>> ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0)
>>>>>
>>>>> Not sure why it was changed. Maybe it was an attempt to correctly
>>>>> handle the '\0' symbol (it is a valid Unicode character)?
>>>> The define you have pasted can return 0xffff.
>>>> The reasons to change it back are described in the previous patchset.
>>>> In short:
>>>> 1. It is equivalent to
>>>> a. checking s < e in the while loop condition
>>>> b. reading the next character inside the while loop body.
>>>> 2. In some usages of the code this check (s < e) was redundant (it
>>>> was performed a couple of lines above).
>>>> 3. There is no reason to rewrite the old version of this function.
>>>> (So we decided to use the old version of the function.)
>>>>> So I see three ways to proceed:
>>>>>
>>>>> 1. Lean on icu's check and ignore the possibility of the buffer overflow.
>>>>> 2. Use our own check and possibly meet '\0' problems.
>>>>> 3. Check for U_ILLEGAL_ARGUMENT_ERROR to treat it as the end of a string,
>>>>>    and raise an error for any other 0xffff.
>>>>>
>>>>> Alex, what do you suggest here?
>>>> As I understand it, 0xffff is currently used ONLY to handle the case
>>>> of an unexpectedly truncated symbol.
>>>> E.g. some symbol consists of 2 characters, but the length of the
>>>> input buffer is 1.
>>>> In my opinion this is the same as an invalid symbol.
>>>>
>>>> I guess that an internal buffer overflow cannot occur in the
>>>> `ucnv_getNextUChar` function.
>>>>
>>>> I suppose that it is Nikita's duty to investigate this problem and
>>>> explain it to us all. I have just noticed a strange usage.
>>>
>>> Hello, please consider my comments.
>>>
>>> There are some cases when 0xffff can occur, but:
>>> 1) Cannot trigger in our context.
>>> 2) Cannot trigger in our context.
>>> 3) Only triggers if end < start. (Cannot happen in
>>> sql_utf8_pattern_compare, I guess)
>>> 4) Only triggers if string length > (size_t) 0x7fffffff (can it
>>> actually happen? I don’t think so).
>>> 5) Occurs when trying to access unindexed data.
>>> 6) Cannot occur in our context.
>>> 7) Cannot occur in our context.
>> I do not understand what those numbers relate to. Please describe
>> them.
>
> They are related to possible cases returning 0xffff from icu source 
> code (function ucnv_getNextUChar()).
Can you just copy them here, so that anyone interested in this
conversation can analyze them without looking up the source files?
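
To make the distinction under discussion concrete, here is a minimal sketch of
a reader that returns 0xffff for several different reasons and reports which
one through a status code. It does not call ICU: `read_char()`, the `RS_*`
status names and the `CHAR_*` constants are hypothetical stand-ins, loosely
modelled on the `sourceLimit < s` check quoted above, and the decoder only
handles 1- and 2-byte UTF-8 sequences (no overlong checks) to stay short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; they only mimic the idea of an error code
 * that accompanies the returned character. */
enum read_status {
	RS_OK = 0,
	RS_ILLEGAL_ARGUMENT,	/* end < start */
	RS_END_OF_INPUT,	/* nothing left to read */
	RS_TRUNCATED		/* input ends in the middle of a sequence */
};

#define CHAR_ERROR   0xffffu	/* always paired with a non-OK status */
#define CHAR_INVALID 0xfffdu	/* malformed byte, status stays RS_OK */

/*
 * Hypothetical reader: decodes one code point from [*s, end) and
 * advances *s past the bytes it consumed.
 */
static uint32_t
read_char(const unsigned char **s, const unsigned char *end,
	  enum read_status *status)
{
	const unsigned char *p = *s;
	*status = RS_OK;
	if (end < p) {
		*status = RS_ILLEGAL_ARGUMENT;
		return CHAR_ERROR;
	}
	if (p == end) {
		*status = RS_END_OF_INPUT;
		return CHAR_ERROR;
	}
	if (p[0] < 0x80) {
		/* Plain ASCII byte, including a genuine '\0'. */
		*s = p + 1;
		return p[0];
	}
	if ((p[0] & 0xe0) == 0xc0) {
		if (end - p < 2) {
			/* Lead byte without its continuation byte. */
			*s = end;
			*status = RS_TRUNCATED;
			return CHAR_ERROR;
		}
		if ((p[1] & 0xc0) == 0x80) {
			*s = p + 2;
			return ((uint32_t)(p[0] & 0x1f) << 6) | (p[1] & 0x3f);
		}
	}
	/* Anything else is treated as a single malformed byte. */
	*s = p + 1;
	return CHAR_INVALID;
}
```

With a reader shaped like this, a caller can treat RS_END_OF_INPUT as the
normal end of the string, RS_TRUNCATED like an invalid symbol, and
RS_ILLEGAL_ARGUMENT as an internal error, instead of looking at 0xffff alone.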
>
>>>
>>> 0xfffd only means that symbol cannot be treated as a unicode symbol.
>>>
>>> Shall I change it somehow then?
>>>
>>>
>>>> On 17 Aug 2018, at 12:23, Alex Khatskevich 
>>>> <avkhatskevich@tarantool.org <mailto:avkhatskevich@tarantool.org>> 
>>>> wrote:
>>>>
>>>> I had a look at the ICU code and it seems like 0xffff is an error,
>>>> and it is more similar to an invalid symbol than to "end of string".
>>>> Check it, and fix the code so that it is treated as an error.
>>>> For example it is not handled in the main pattern loop:
>>>>
>>>> +while (pattern < pattern_end) {
>>>> c = Utf8Read(pattern, pattern_end);
>>>> +if (c == SQL_INVALID_UTF8_SYMBOL)
>>>> +return SQL_INVALID_PATTERN;
>>>>
>>>> It seems like the 0xffff should be checked there too.
>>>
>>> No, it should not. That would only introduce a bug: for example,
>>> 'select "" like ""' would be treated as an error.
>> I do not understand.
>> 'select "" like ""' should not even enter the while loop
>> (because `pattern < pattern_end` is false).
>
> Ah, you’re right, sorry, then it just doesn’t matter, since the
> pattern < pattern_end check is equivalent to the 0xffff check
> according to the comment above.
>
> --
> WBR, Nikita Tatunov.
> n.tatunov@tarantool.org <mailto:n.tatunov@tarantool.org>
>
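
On the `pattern < pattern_end` point and the earlier '\0' concern, a small
sketch of why the two loop shapes differ. `READ_OR_ZERO` is a hypothetical
byte-wise simplification of the quoted `Utf8Read(s, e)` macro (it reads one
byte instead of calling ICU), and both counting functions are made up for
illustration only.

```c
#include <stddef.h>

/* Hypothetical byte-wise stand-in for the Utf8Read(s, e) macro:
 * yields 0 once s reaches e, otherwise the next byte. */
#define READ_OR_ZERO(s, e) (((s) < (e)) ? (unsigned char)*(s)++ : 0)

/* Stops on the first 0, so an embedded '\0' looks like the end. */
static size_t
count_until_zero(const char *s, const char *e)
{
	size_t n = 0;
	while (READ_OR_ZERO(s, e) != 0)
		n++;
	return n;
}

/* Checks the bounds in the loop condition; the body is never
 * entered for an empty input. */
static size_t
count_in_bounds(const char *s, const char *e)
{
	size_t n = 0;
	while (s < e) {
		(void)*s++;
		n++;
	}
	return n;
}
```

For the three bytes 'a', '\0', 'b' the first function stops after one
character while the second visits all three; for an empty input both return
zero without reading anything.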


--------------3DEA5F9B877AB120207F99D8
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: 8bit

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 11.09.2018 09:06, Nikita Tatunov
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <br class="">
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">On 11 Sep 2018, at 01:20, Alex Khatskevich &lt;<a
              href="mailto:avkhatskevich@tarantool.org" class=""
              moz-do-not-send="true">avkhatskevich@tarantool.org</a>&gt;
            wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <blockquote type="cite"
              cite="mid:87897608-173E-45EB-80A1-8B249706D8A1@tarantool.org"
              style="font-family: Helvetica; font-size: 12px;
              font-style: normal; font-variant-caps: normal;
              font-weight: normal; letter-spacing: normal; orphans:
              auto; text-align: start; text-indent: 0px; text-transform:
              none; white-space: normal; widows: auto; word-spacing:
              0px; -webkit-text-size-adjust: auto;
              -webkit-text-stroke-width: 0px; background-color: rgb(255,
              255, 255); text-decoration: none;" class="">
              <div class=""><br class="Apple-interchange-newline">
                <br class="">
                <blockquote type="cite" class="">
                  <div class="">On 17 Aug 2018, at 14:42, Alex
                    Khatskevich &lt;<a
                      href="mailto:avkhatskevich@tarantool.org" class=""
                      moz-do-not-send="true">avkhatskevich@tarantool.org</a>&gt;
                    wrote:</div>
                  <br class="Apple-interchange-newline">
                  <div class=""><br class="" style="caret-color: rgb(0,
                      0, 0); font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">On 17.08.2018 14:17, Alexander
                      Turenko wrote:</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <blockquote type="cite" class="" style="font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">0xffff is the result of 'end of a string'
                      check as well as internal buffer<br class="">
                      overflow error. I have the relevant code pasted in
                      the first review of<br class="">
                      the patch (July, 18).<br class="">
                      <br class="">
                      // source/common/ucnv.c::ucnv_getNextUChar<br
                        class="">
                      1860     s=*source;<br class="">
                      1861     if(sourceLimit&lt;s) {<br class="">
                      1862         *err=U_ILLEGAL_ARGUMENT_ERROR;<br
                        class="">
                      1863         return 0xffff;<br class="">
                      1864     }<br class="">
                      <br class="">
                      We should not handle the buffer overflow case as
                      an invalid symbol. Of<br class="">
                      course we should not handle it as the 'end of the
                      string' situation.<br class="">
                      Ideally we should perform the pointer check ourselves and raise
                      an error in case of<br class="">
                      0xffff. I had thought that a buffer overflow error
                      was unlikely to occur,<br class="">
                      but you are right: we should differentiate these
                      situations.<br class="">
                      <br class="">
                      In one of the previous versions of the patch we
                      performed this check like so:<br class="">
                      <br class="">
                      #define Utf8Read(s, e) (((s) &lt; (e)) ?\<br
                        class="">
                      <span class="Apple-tab-span" style="white-space: pre;">	</span>ucnv_getNextUChar(pUtf8conv,
                      &amp;s, e, &amp;status) : 0)<br class="">
                      <br class="">
                      Not sure why it was changed. Maybe it was an attempt to
                      correctly handle the '\0'<br class="">
                      symbol (it is a valid Unicode character)?<br
                        class="">
                    </blockquote>
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">The define you have pasted can
                      return 0xffff.</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">The reasons to change it back
                      are described in the previous patchset.</span><br
                      class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">In short:</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">1. It is equivalent to</span><br
                      class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">  <span
                        class="Apple-converted-space"> </span>a. check s
                      &lt; e in a while loop</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">  <span
                        class="Apple-converted-space"> </span>b. read
                      next character inside of the while loop body.</span><br
                      class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">2. In some usages of the code
                      this check (s&lt;e) was redundant (it was
                      performed a couple lines above)</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">3. There is no reason to
                      rewrite the old version of this function. (So, we
                      decided to use old version of the function)</span><br
                      class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <blockquote type="cite" class="" style="font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">So I see three ways to proceed:<br class="">
                      <br class="">
                      1. Lean on icu's check and ignore possibility of
                      the buffer overflow.<br class="">
                      2. Use our own check and possibly meet '\0'
                      problems.<br class="">
                      3. Check for U_ILLEGAL_ARGUMENT_ERROR to treat as
                      end of a string, raise<br class="">
                         the error for other 0xffff.<br class="">
                      <br class="">
                      Alex, what do you suggest here?<br class="">
                    </blockquote>
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">As I understand it, 0xffff is
                      currently used ONLY to handle the case of an
                      unexpectedly truncated symbol.</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">E.g. some symbol consists of 2
                      characters, but the length of the input buffer is
                      1.</span><br class="" style="caret-color: rgb(0,
                      0, 0); font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">In my opinion this is the same
                      as an invalid symbol.</span><br class=""
                      style="caret-color: rgb(0, 0, 0); font-family:
                      Helvetica; font-size: 12px; font-style: normal;
                      font-variant-caps: normal; font-weight: normal;
                      letter-spacing: normal; text-align: start;
                      text-indent: 0px; text-transform: none;
                      white-space: normal; word-spacing: 0px;
                      -webkit-text-stroke-width: 0px; text-decoration:
                      none;">
                    <br class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">I guess that internal buffer
                      overflow cannot occur in the `ucnv_getNextUChar`
                      function.</span><br class="" style="caret-color:
                      rgb(0, 0, 0); font-family: Helvetica; font-size:
                      12px; font-style: normal; font-variant-caps:
                      normal; font-weight: normal; letter-spacing:
                      normal; text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <br class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none;">
                    <span class="" style="caret-color: rgb(0, 0, 0);
                      font-family: Helvetica; font-size: 12px;
                      font-style: normal; font-variant-caps: normal;
                      font-weight: normal; letter-spacing: normal;
                      text-align: start; text-indent: 0px;
                      text-transform: none; white-space: normal;
                      word-spacing: 0px; -webkit-text-stroke-width: 0px;
                      text-decoration: none; float: none; display:
                      inline !important;">I suppose that it is Nikita's
                      duty to investigate this problem and explain it to
                      us all. I have just noticed a strange usage.</span></div>
                </blockquote>
              </div>
              <div class=""><br class="">
              </div>
              <div class="">Hello, please consider my comments.</div>
              <div class=""><br class="">
              </div>
              <div class="">There are some cases when 0xffff can occur,
                but:</div>
              <div class=""><span class="Apple-tab-span" style="white-space: pre;">	</span>1) <span
                  class="" style="font-family: HelveticaNeue;">Cannot
                  trigger in our context.</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;">	</span>2)
                  C</span><span class="" style="font-family:
                  HelveticaNeue;">annot trigger in our context.</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;">	</span>3)
                  O</span><span class="" style="font-family:
                  HelveticaNeue;">nly triggers if end &lt; start.
                  (Cannot happen in sql_utf8_pattern_compare, I guess)</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;">	</span>4)
                  O</span><span class="" style="font-family:
                  HelveticaNeue;">nly triggers if string length &gt;
                  (size_t) 0x7fffffff (can it actually happen? I don’t
                  think so).</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;">	</span>5)
                  O</span><span class="" style="font-family:
                  HelveticaNeue;">ccurs when trying to access
                  unindexed data.</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;">	</span>6)
                  Cannot occur in our context.</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><span class="Apple-tab-span" style="white-space: pre;">	</span>7) </span><span
                  class="" style="font-family: HelveticaNeue;">Cannot
                  occur in our context.</span></div>
            </blockquote>
            <span style="caret-color: rgb(0, 0, 0); font-family:
              Helvetica; font-size: 12px; font-style: normal;
              font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none; float: none; display: inline !important;" class="">I
              do not understand what those numbers relate to.
              Please describe them.</span><br style="caret-color: rgb(0,
              0, 0); font-family: Helvetica; font-size: 12px;
              font-style: normal; font-variant-caps: normal;
              font-weight: normal; letter-spacing: normal; text-align:
              start; text-indent: 0px; text-transform: none;
              white-space: normal; word-spacing: 0px;
              -webkit-text-stroke-width: 0px; background-color: rgb(255,
              255, 255); text-decoration: none;" class="">
          </div>
        </blockquote>
        <div><br class="">
        </div>
        <div>They refer to the possible cases in which the ICU source
          code (function ucnv_getNextUChar()) returns 0xffff.</div>
      </div>
    </blockquote>
    Can you just copy the code here, so that anyone interested in this
    conversation can<br>
    analyze it without looking up the source files?<br>
    <blockquote type="cite"
      cite="mid:58B407E2-AF5D-4531-A9FF-9DC57CE0070B@tarantool.org">
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">
            <blockquote type="cite"
              cite="mid:87897608-173E-45EB-80A1-8B249706D8A1@tarantool.org"
              style="font-family: Helvetica; font-size: 12px;
              font-style: normal; font-variant-caps: normal;
              font-weight: normal; letter-spacing: normal; orphans:
              auto; text-align: start; text-indent: 0px; text-transform:
              none; white-space: normal; widows: auto; word-spacing:
              0px; -webkit-text-size-adjust: auto;
              -webkit-text-stroke-width: 0px; background-color: rgb(255,
              255, 255); text-decoration: none;" class="">
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><br class="">
                </span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;">0xfffd only means that the symbol cannot
                  be treated as a Unicode symbol.</span></div>
              <div class=""><span class="" style="font-family:
                  HelveticaNeue;"><br class="">
                </span></div>
              <div class="">
                <div class="">Shall I change it somehow then?</div>
              </div>
              <div class=""><br class="">
              </div>
              <div class=""><br class="">
                <blockquote type="cite" class="">
                  <div class="">On 17 Aug 2018, at 12:23, Alex
                    Khatskevich &lt;<a
                      href="mailto:avkhatskevich@tarantool.org" class=""
                      moz-do-not-send="true">avkhatskevich@tarantool.org</a>&gt;
                    wrote:</div>
                  <br class="Apple-interchange-newline">
                  <div class=""><span class="" style="float: none;
                      display: inline !important;">I had a look at the ICU
                      code and it seems like 0xffff is an error, and it
                      is more similar to an</span><br class="">
                    <span class="" style="float: none; display: inline
                      !important;">invalid symbol than to "end of
                      string". Check it, and fix the code, so that it is
                      treated as</span><br class="">
                    <span class="" style="float: none; display: inline
                      !important;">an error.</span><br class="">
                    <span class="" style="float: none; display: inline
                      !important;">For example, it is not handled in the
                      main pattern loop:</span><br class="">
                    <br class="">
                    <span class="" style="float: none; display: inline
                      !important;">+</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span
                      class="" style="float: none; display: inline
                      !important;">while (pattern &lt; pattern_end) {</span><br
                      class="">
                    <span class="Apple-tab-span" style="white-space: pre;">	</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span
                      class="" style="float: none; display: inline
                      !important;">c = Utf8Read(pattern, pattern_end);</span><br
                      class="">
                    <span class="" style="float: none; display: inline
                      !important;">+</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span
                      class="" style="float: none; display: inline
                      !important;">if (c == SQL_INVALID_UTF8_SYMBOL)</span><br
                      class="">
                    <span class="" style="float: none; display: inline
                      !important;">+</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span class="Apple-tab-span" style="white-space: pre;">	</span><span
                      class="" style="float: none; display: inline
                      !important;">return SQL_INVALID_PATTERN;</span><br
                      class="">
                    <br class="">
                    <span class="" style="float: none; display: inline
                      !important;">It seems like the 0xffff should be
                      checked there too.</span></div>
                </blockquote>
                <br class="">
              </div>
              <div class="">No, it should not. That would only introduce a bug:
                for example, ’select “” like “”’</div>
              <div class="">would be treated as an error.</div>
            </blockquote>
            <span style="caret-color: rgb(0, 0, 0); font-family:
              Helvetica; font-size: 12px; font-style: normal;
              font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none; float: none; display: inline !important;" class="">I
              do not understand.</span><br style="caret-color: rgb(0, 0,
              0); font-family: Helvetica; font-size: 12px; font-style:
              normal; font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none;" class="">
            <span style="caret-color: rgb(0, 0, 0); font-family:
              Helvetica; font-size: 12px; font-style: normal;
              font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none; float: none; display: inline !important;" class="">’select
              “” like “”’ should not even enter the while loop</span><br
              style="caret-color: rgb(0, 0, 0); font-family: Helvetica;
              font-size: 12px; font-style: normal; font-variant-caps:
              normal; font-weight: normal; letter-spacing: normal;
              text-align: start; text-indent: 0px; text-transform: none;
              white-space: normal; word-spacing: 0px;
              -webkit-text-stroke-width: 0px; background-color: rgb(255,
              255, 255); text-decoration: none;" class="">
            <span style="caret-color: rgb(0, 0, 0); font-family:
              Helvetica; font-size: 12px; font-style: normal;
              font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none; float: none; display: inline !important;" class="">(because<span
                class="Apple-converted-space"> </span></span><span
              class="" style="caret-color: rgb(0, 0, 0); font-family:
              Helvetica; font-size: 12px; font-style: normal;
              font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none; float: none; display: inline !important;">`pattern
              &lt; pattern_end` is false).<br class="">
            </span><span style="caret-color: rgb(0, 0, 0); font-family:
              Helvetica; font-size: 12px; font-style: normal;
              font-variant-caps: normal; font-weight: normal;
              letter-spacing: normal; text-align: start; text-indent:
              0px; text-transform: none; white-space: normal;
              word-spacing: 0px; -webkit-text-stroke-width: 0px;
              background-color: rgb(255, 255, 255); text-decoration:
              none; float: none; display: inline !important;" class=""></span></div>
        </blockquote>
        <br class="">
      </div>
      <div>Ah, you’re right, sorry. Then it just doesn’t matter, since the
        pattern &lt; pattern_end check is equivalent</div>
      <div>to a check for 0xffff, according to the comment above.</div>
      <br class="">
      <div class="">
        <div dir="auto" style="word-wrap: break-word; -webkit-nbsp-mode:
          space; line-break: after-white-space;" class="">
          <div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);
            font-family: Helvetica; font-size: 12px; font-style: normal;
            font-variant-caps: normal; font-weight: normal;
            letter-spacing: normal; text-align: start; text-indent: 0px;
            text-transform: none; white-space: normal; word-spacing:
            0px; -webkit-text-stroke-width: 0px; text-decoration: none;">--</div>
          <div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);
            font-family: Helvetica; font-size: 12px; font-style: normal;
            font-variant-caps: normal; font-weight: normal;
            letter-spacing: normal; text-align: start; text-indent: 0px;
            text-transform: none; white-space: normal; word-spacing:
            0px; -webkit-text-stroke-width: 0px; text-decoration: none;">WBR,
            Nikita Tatunov.</div>
          <div style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);
            font-family: Helvetica; font-size: 12px; font-style: normal;
            font-variant-caps: normal; font-weight: normal;
            letter-spacing: normal; text-align: start; text-indent: 0px;
            text-transform: none; white-space: normal; word-spacing:
            0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><a
              href="mailto:n.tatunov@tarantool.org" class=""
              moz-do-not-send="true">n.tatunov@tarantool.org</a></div>
        </div>
      </div>
      <br class="">
    </blockquote>
    <br>
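    The point about the empty pattern can be illustrated with a minimal C
    sketch. This is not the patch code: sketch_read and<br>
    sketch_count_sentinels are hypothetical names, a one-byte reader stands
    in for Utf8Read, and only the "no input" case of 0xffff is modelled.<br>
    The loop guard is written as pos != len, which plays the role of
    pattern &lt; pattern_end here.<br>
    <pre>
```c
/* Sentinel value discussed in the thread; the macro name is hypothetical. */
#define SKETCH_NO_INPUT 0xffffu

/* Hypothetical stand-in for Utf8Read: yields the byte at index pos,
 * or the sentinel when pos is already at the end (the "no input" case). */
static unsigned int
sketch_read(const unsigned char *str, int pos, int len)
{
	if (pos == len)
		return SKETCH_NO_INPUT;
	return str[pos];
}

/* Walks the pattern under the same kind of guard as the main pattern loop
 * and counts how often the sentinel is seen inside the loop body. */
static int
sketch_count_sentinels(const unsigned char *pattern, int len)
{
	int sentinels = 0;
	int pos = 0;
	while (pos != len) {
		unsigned int c = sketch_read(pattern, pos, len);
		pos++;
		if (c == SKETCH_NO_INPUT)
			sentinels++;
	}
	return sentinels;
}
```
</pre>
    Under these assumptions the loop body never runs for an empty pattern,
    so the sentinel is never observed inside the loop;<br>
    it only shows up when the reader is called directly at the end of the
    input.<br>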
  </body>
</html>

--------------3DEA5F9B877AB120207F99D8--