[tarantool-patches] Re: [PATCH 1/2 v2] test: check that collations isn't ignored in SELECTs

Tue Jul 2 20:21:02 MSK 2019

> On 28 Jun 2019, at 16:11, Roman Khabibov <roman.habibov at tarantool.org> wrote:
> 
> Hi! Thanks for the review.
> 
>> On Jun 25, 2019, at 8:30 PM, n.pettik <korablev at tarantool.org <mailto:korablev at tarantool.org>> wrote:
>>> On 24 Jun 2019, at 13:54, Roman Khabibov <roman.habibov at tarantool.org <mailto:roman.habibov at tarantool.org>> wrote:
>>> 
>>> Add test to check that a new collation isn't ignored regardless
>>> of a name of a previous one in the following patterns of quries:
>> 
>> Nit: quries -> queries
> Fixed.
> 
>>> 
>>> SELECT s COLLATE "unicode_ci" FROM a ORDER BY s COLLATE “unicode_ci"
>> 
>> Why do you consider this kind of query? What can be wrong with it?
>> Specifying collation for result set members doesn’t make much sense btw.
> I consider it, because it didn’t work after my addition of checking from the
> second patch:
> +static int
> +check_collate_arg(struct Parse *parse, struct Expr *expr)
> +{
> +	struct Expr *left = expr->pLeft;
> +	while (left->op == TK_COLLATE)
> +		left = left->pLeft;
> +	enum field_type type = sql_expr_type(left);
> +	if (type != FIELD_TYPE_STRING && type != FIELD_TYPE_SCALAR) {
> +		diag_set(ClientError, ER_SQL_PARSER_GENERIC,
> +			 "COLLATE clause can't be used with non-string "
> +			 "arguments");
> +		parse->is_aborted = true;
> +		return -1;
> +	}
> +	return 0;
> +}
> The tree of this query included sequence of collations on left nodes. This is
> the result of optimizations in multiSelectOrderBy. Now I can not exactly say how
> It occurs. At first time, I didn’t use loop to descend throughout left nodes.
> I added this loop, because of that reason.

Wait, in the example like this:

 SELECT s COLLATE "unicode_ci" FROM a ORDER BY s COLLATE “unicode_ci"

collate clauses refer to independent parts of query and
can’t contradict to each other. As I already said:

"First one is used during sorting
routines, second one is simply ignored (since values of
resulting set are not supposed to be compared with anything else)."

Why then are they organised into list? Please, investigate reasons
of this “optimisation”.

>>> Also note: It is disallowed to compare strings with different
>>> collations: ISO/IEC JTC 1/SC 32, Part 2: Foundation, page 531
>>> ---
>>> src/box/sql/resolve.c           |  7 +++++++
>>> test/sql-tap/collation.test.lua | 19 ++++++++++++++++++-
>>> 2 files changed, 25 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/src/box/sql/resolve.c b/src/box/sql/resolve.c
>>> index fdf3703da..348b3ea9a 100644
>>> --- a/src/box/sql/resolve.c
>>> +++ b/src/box/sql/resolve.c
>>> @@ -109,6 +109,13 @@ resolveAlias(Parse * pParse,	/* Parsing context */
>>> 		return;
>>> 	if (zType[0] != 'G')
>>> 		incrAggFunctionDepth(pDup, nSubquery);
>>> +	/*
>>> +	 * If there was typed more than one explicit collations in
>>> +	 * query, it will be a sequence of left nodes with the
>>> +	 * collations in a tree. There is nothing special about
>>> +	 * keeping the sequence. Only one collation could be
>>> +	 * stored, but the present solution is simpler.
>>> +	 */
>> 
>> Do not understand how mentioned example is related to this code.
>> I suppose you might mean example like:
>> 
>> SELECT s COLLATE “unicode” COLLATE “binary” COLLATE “unicode_ci” …
>> 
>> Is this syntax allowed by ANSI? If so, which one (first or last) collation must be used?
>> 
>>> 	if (pExpr->op == TK_COLLATE) {
>>> 		pDup =
>>> 		    sqlExprAddCollateString(pParse, pDup, pExpr->u.zToken);
>>> diff --git a/test/sql-tap/collation.test.lua b/test/sql-tap/collation.test.lua
>>> index 79547361c..f36540fc2 100755
>>> --- a/test/sql-tap/collation.test.lua
>>> +++ b/test/sql-tap/collation.test.lua
>>> @@ -1,6 +1,6 @@
>>> #!/usr/bin/env tarantool
>>> test = require("sqltester")
>>> -test:plan(174)
>>> +test:plan(177)
>>> 
>>> local prefix = "collation-"
>>> 
>>> @@ -529,4 +529,21 @@ test:do_catchsql_test(
>>>       'CREATE TABLE test3 (a int, b int, c int, PRIMARY KEY (a, a COLLATE foo, b, c))',
>>>       {1, "Collation 'FOO' does not exist"})
>>> 
>>> +-- gh-3805 Check that collation is not ignored.
>> 
>> Why these test cases are related to the mentioned ticket?
>> Which collation might be ignored? I see two collations in
>> two different parts of query. First one is used during sorting
>> routines, second one is simply ignored (since values of
>> resulting set are not supposed to be compared with anything else).
> See above.

See above what?:) Please drop this commit or provide
decent explanation. Now I can’t understand how provided
test is related to the ticket and what’s wrong with this case
at all.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20190702/2be0b2f4/attachment.html>