From: Alex Khatskevich <avkhatskevich@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>
Cc: "N.Tatunov" <n.tatunov@tarantool.org>,
tarantool-patches@freelists.org,
"N.Tatunov" <hollow653@gmail.com>
Subject: [tarantool-patches] Re: [PATCH 1/2] sql: LIKE & GLOB pattern comparison issue
Date: Fri, 17 Aug 2018 14:42:47 +0300
Message-ID: <436d256a-f9d0-781f-8cad-179d7322c7bd@tarantool.org>
In-Reply-To: <20180817111727.y6nsbblpm5nh4n3g@tkn_work_nb>
On 17.08.2018 14:17, Alexander Turenko wrote:
> 0xffff is the result of 'end of a string' check as well as internal buffer
> overflow error. I have the relevant code pasted in the first review of
> the patch (July, 18).
>
> // source/common/ucnv.c::ucnv_getNextUChar
>     s=*source;
>     if(sourceLimit<s) {
>         *err=U_ILLEGAL_ARGUMENT_ERROR;
>         return 0xffff;
>     }
>
> We should not handle the buffer overflow case as an invalid symbol. Of
> course we should not handle it as the 'end of the string' situation
> either. Ideally we should perform the pointer check ourselves and raise
> an error in case of 0xffff. I had thought that a buffer overflow error
> is unlikely to occur, but you are right: we should differentiate these
> situations.
>
> In one of the previous versions of the patch we performed this check like so:
>
> #define Utf8Read(s, e) (((s) < (e)) ?\
> ucnv_getNextUChar(pUtf8conv, &s, e, &status) : 0)
>
> Not sure why it was changed. Maybe it is an attempt to correctly handle
> the '\0' symbol (it is a valid Unicode character)?
The define you pasted can still return 0xffff.
The reasons for changing it back are described in the previous patchset.
In short:
1. It is equivalent to
   a. checking s < e in the while loop condition, and
   b. reading the next character inside the while loop body.
2. In some places this check (s < e) was redundant: it had already been
   performed a couple of lines above.
3. There is no reason to rewrite the old version of this function, so
   we decided to keep it.
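
To illustrate point 1, here is a minimal sketch of the two shapes of the
loop. It is not the code from the patch: `next_char`, `READ_CHECKED`,
`count_with_macro` and `count_in_loop` are hypothetical names, and
`next_char` is a one-byte stand-in for `ucnv_getNextUChar` so that the
example builds without ICU.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for ucnv_getNextUChar(): returns one byte and
 * advances the pointer. The caller must guarantee *s < e.
 */
static unsigned
next_char(const char **s, const char *e)
{
	(void)e;
	return (unsigned char)*(*s)++;
}

/* Shape A: the bounds check is hidden inside the read macro. */
#define READ_CHECKED(s, e) (((s) < (e)) ? next_char(&(s), (e)) : 0)

static size_t
count_with_macro(const char *s, const char *e)
{
	size_t n = 0;
	/* 0 means "end of input" here, but also a real '\0' symbol. */
	while (READ_CHECKED(s, e) != 0)
		n++;
	return n;
}

/* Shape B: the bounds check is the loop condition, the read is in the body. */
static size_t
count_in_loop(const char *s, const char *e)
{
	size_t n = 0;
	while (s < e) {
		(void)next_char(&s, e);
		n++;
	}
	return n;
}
```

For input without an embedded '\0' both shapes visit the same symbols.
With an embedded '\0' the macro version stops early, because it reuses 0
as the end marker, while the explicit loop reads up to `e`.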
> So I see three ways to proceed:
>
> 1. Lean on icu's check and ignore possibility of the buffer overflow.
> 2. Use our own check and possibly meet '\0' problems.
> 3. Check for U_ILLEGAL_ARGUMENT_ERROR to treat as end of a string, raise
> the error for other 0xffff.
>
> Alex, what do you suggest here?
As I understand it, 0xffff is currently used ONLY to handle the case of
an unexpectedly truncated symbol.
E.g. a symbol consists of 2 bytes, but the length of the input
buffer is 1.
In my opinion this is the same as an invalid symbol.
I guess that an internal buffer overflow cannot occur in the
`ucnv_getNextUChar` function.
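
The cases discussed here can be told apart by looking at a status code
and not only at the returned 0xffff. Below is a minimal sketch under
stated assumptions: `read_symbol`, `enum read_status` and `SYM_ERROR` are
hypothetical names, and the decoder is a simplified UTF-8 reader written
for illustration, not ICU's implementation. As far as I recall, ICU
reports the empty-input and truncated cases through
U_INDEX_OUTOFBOUNDS_ERROR and U_TRUNCATED_CHAR_FOUND, but that should be
verified against the ICU documentation.

```c
#include <stddef.h>

/* Hypothetical status codes; they only mirror the cases from this thread. */
enum read_status {
	READ_OK,
	READ_END,       /* s == e: nothing left to read */
	READ_BAD_ARGS,  /* e < s: caller error, the case from ucnv.c above */
	READ_TRUNCATED, /* multi-byte symbol cut off by the end of the buffer */
	READ_INVALID    /* malformed lead or continuation byte */
};

#define SYM_ERROR 0xffffu

/*
 * Simplified UTF-8 reader (no overlong or surrogate checks).
 * Returns the code point, or SYM_ERROR with the reason stored in *st.
 */
static unsigned
read_symbol(const char **s, const char *e, enum read_status *st)
{
	const unsigned char *p = (const unsigned char *)*s;
	const unsigned char *end = (const unsigned char *)e;
	unsigned c;
	int extra;

	if (end < p) {
		*st = READ_BAD_ARGS;
		return SYM_ERROR;
	}
	if (p == end) {
		*st = READ_END;
		return SYM_ERROR;
	}
	c = *p++;
	if (c < 0x80) {
		extra = 0;
	} else if ((c & 0xe0) == 0xc0) {
		extra = 1; c &= 0x1f;
	} else if ((c & 0xf0) == 0xe0) {
		extra = 2; c &= 0x0f;
	} else if ((c & 0xf8) == 0xf0) {
		extra = 3; c &= 0x07;
	} else {
		*s = (const char *)p;
		*st = READ_INVALID;
		return SYM_ERROR;
	}
	for (; extra > 0; extra--) {
		if (p == end || (*p & 0xc0) != 0x80) {
			/* Ran out of input vs. hit a wrong byte. */
			*st = (p == end) ? READ_TRUNCATED : READ_INVALID;
			*s = (const char *)p;
			return SYM_ERROR;
		}
		c = (c << 6) | (*p++ & 0x3fu);
	}
	*s = (const char *)p;
	*st = READ_OK;
	return c;
}
```

With such a status in hand, the caller can decide whether a truncated
symbol is reported as an invalid symbol (what I suggest above) while
still treating READ_END and READ_BAD_ARGS differently.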
I suppose it is Nikita's duty to investigate this problem and
explain it to us all. I have just noticed a strange usage.