[tarantool-patches] Re: [PATCH 2/2] sql: compute resulting collation for concatenation

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Jan 24 21:29:01 MSK 2019


Thanks for the patch! See 7 comments below.

On 16/01/2019 16:13, Nikita Pettik wrote:
> According to ANSI, result of concatenation operation should derive
> collation sequence from its operands. Now it is not true: result
> always comes with no ("none") collation.
> 
> In a nutshell*, rules are quite simple:

1. If you want to add a reference, use [<number>], as in the
literature and as Vova does in his commits. At first I thought
that '*' was a typo.

> 
> a) If some data type  has an explicit collation EC1, then every data

2. Double space after 'type'. In some other places below too.

> type that has an explicit collation shall have declared type collation
> that is EC1.  The collation derivation is explicit and the collation is
> EC1.
> 
> b) If every data type has an implicit collation, then:
> 
>   - If every data type has the same declared type collation IC1, then
>     the collation derivation is implicit and the collation is IC1.
> 
>   - Otherwise, the collation derivation is none.
> 
> c) Otherwise, the collation derivation is none.
> 
> *Read complete statement at 9.5 Result of data type combinations

3. Please be more specific: say that 9.5 is a chapter of the SQL
standard, and which year's edition of the standard it is.
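
For readers following the thread, rules a) to c) quoted above can be
condensed into a small standalone C function. This is only a sketch:
`struct sketch_coll`, `SKETCH_COLL_NONE` and `sketch_concat_coll()` are
hypothetical names invented for illustration, and treating two different
explicit collations as an error mirrors the `return -1` in the patch
rather than quoting the standard.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker for "no collation"; not the id Tarantool uses. */
#define SKETCH_COLL_NONE UINT32_MAX

/* Hypothetical description of one operand of the concatenation. */
struct sketch_coll {
	uint32_t coll_id;   /* SKETCH_COLL_NONE if the operand has none. */
	bool is_explicit;   /* True if set by an explicit COLLATE clause. */
};

/*
 * Derive the collation of "lhs || rhs".
 * Returns 0 and fills *res on success, -1 if both operands carry
 * different explicit collations.
 */
static int
sketch_concat_coll(struct sketch_coll lhs, struct sketch_coll rhs,
		   struct sketch_coll *res)
{
	/* Rule a): all explicit collations must be the same one. */
	if (lhs.is_explicit && rhs.is_explicit &&
	    lhs.coll_id != rhs.coll_id)
		return -1;
	if (lhs.is_explicit || rhs.is_explicit) {
		res->coll_id = lhs.is_explicit ? lhs.coll_id : rhs.coll_id;
		res->is_explicit = true;
		return 0;
	}
	/*
	 * Rules b) and c): no explicit collation is involved. Equal
	 * ids are kept as an implicit collation, anything else
	 * degrades to "none".
	 */
	res->is_explicit = false;
	res->coll_id = lhs.coll_id == rhs.coll_id ?
		       lhs.coll_id : SKETCH_COLL_NONE;
	return 0;
}
```

With this shape, an explicit collation on either side wins over an
implicit one, and two different implicit collations yield "none"
instead of an error.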

> 
> Closes #3937
> ---
>   src/box/sql/expr.c          |  47 +++++++++++++++++++-
>   test/sql/collation.result   | 102 ++++++++++++++++++++++++++++++++++++++++++++
>   test/sql/collation.test.lua |  46 ++++++++++++++++++++
>   3 files changed, 193 insertions(+), 2 deletions(-)
> 
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index f8819f779..e6f536757 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -221,6 +221,45 @@ sql_expr_coll(Parse *parse, Expr *p, bool *is_explicit_coll, uint32_t *coll_id)
>   			}
>   			break;
>   		}
> +		if (op == TK_CONCAT) {
> +			/*
> +			 * Despite the fact that procedure below
> +			 * is very similar to collation_check_compatability(),
> +			 * it is slightly different: when both
> +			 * operands have different implicit collations,
> +			 * derived collation should be "none",
> +			 * i.e. no collation is used at all
> +			 * (instead of raising error).

4. Typo: collation_check_compatability -> collation*S*_check_compat*I*bility.

Also, I think it is not worth mentioning the difference from that
function here, especially in such a big comment: it looks like an excuse.
It is better to put a link to the standard.

> +			 */
> +			bool is_lhs_forced;
> +			uint32_t lhs_coll_id;
> +			if (sql_expr_coll(parse, p->pLeft, &is_lhs_forced,
> +					  &lhs_coll_id) != 0)
> +				return -1;
> +			bool is_rhs_forced;
> +			uint32_t rhs_coll_id;
> +			if (sql_expr_coll(parse, p->pRight, &is_rhs_forced,
> +					  &rhs_coll_id) != 0)
> +				return -1;
> +			if (is_lhs_forced && is_rhs_forced) {
> +				if (lhs_coll_id != rhs_coll_id)
> +					return -1;

5. Did you miss diag_set?

> +			}
> +			if (is_lhs_forced) {
> +				*coll_id = lhs_coll_id;
> +				*is_explicit_coll = true;
> +				return 0;

6. This function (sql_expr_coll) uses the 'break' keyword to leave
the loop, so let's be consistent and use 'break' here as well.

> +			}
> +			if (is_rhs_forced) {
> +				*coll_id = rhs_coll_id;
> +				*is_explicit_coll = true;
> +				return 0;
> +			}
> +			if (rhs_coll_id != lhs_coll_id)
> +				return 0;
> +			*coll_id = lhs_coll_id;
> +			return 0;
> +		}
>   		if (p->flags & EP_Collate) {
>   			if (p->pLeft && (p->pLeft->flags & EP_Collate) != 0) {
>   				p = p->pLeft;
> @@ -384,10 +423,14 @@ sql_binary_compare_coll_seq(Parse *parser, Expr *left, Expr *right)
>   	bool is_rhs_forced;
>   	uint32_t lhs_coll_id;
>   	uint32_t rhs_coll_id;
> -	if (sql_expr_coll(parser, left, &is_lhs_forced, &lhs_coll_id) != 0)
> +	if (sql_expr_coll(parser, left, &is_lhs_forced, &lhs_coll_id) != 0) {
> +		diag_set(ClientError, ER_ILLEGAL_COLLATION_MIX);
>   		goto err;
> -	if (sql_expr_coll(parser, right, &is_rhs_forced, &rhs_coll_id) != 0)
> +	}
> +	if (sql_expr_coll(parser, right, &is_rhs_forced, &rhs_coll_id) != 0) {
> +		diag_set(ClientError, ER_ILLEGAL_COLLATION_MIX);

7. Why do you set it here and not at point 5 above? sql_expr_coll
can return an error not only because of an illegal collation mix, but
also if a collation does not exist, for example.
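
To illustrate the concern in comments 5 and 7, here is a minimal
standalone sketch of setting the error where it is detected, so that the
caller only propagates it and never overwrites a more specific reason.
Everything prefixed with `sketch_` is a hypothetical stand-in (including
the tiny "last error" variable that plays the role of the diagnostics
area); it is not Tarantool's API, and the choice of the resulting
collation is deliberately simplified.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error codes and a one-slot "diagnostics area". */
enum sketch_errcode {
	SKETCH_OK = 0,
	SKETCH_NO_SUCH_COLLATION,
	SKETCH_ILLEGAL_COLLATION_MIX,
};
static enum sketch_errcode sketch_last_error = SKETCH_OK;

static void
sketch_diag_set(enum sketch_errcode code)
{
	sketch_last_error = code;
}

/* Hypothetical operand: collation id, explicit flag, existence flag. */
struct sketch_operand {
	uint32_t coll_id;
	bool is_forced;
	bool exists;
};

/* Callee: records the precise reason at the point of failure. */
static int
sketch_expr_coll(const struct sketch_operand *lhs,
		 const struct sketch_operand *rhs, uint32_t *coll_id)
{
	if (!lhs->exists || !rhs->exists) {
		sketch_diag_set(SKETCH_NO_SUCH_COLLATION);
		return -1;
	}
	if (lhs->is_forced && rhs->is_forced &&
	    lhs->coll_id != rhs->coll_id) {
		sketch_diag_set(SKETCH_ILLEGAL_COLLATION_MIX);
		return -1;
	}
	/* Simplified: prefer the left operand when it is explicit. */
	*coll_id = lhs->is_forced ? lhs->coll_id : rhs->coll_id;
	return 0;
}

/* Caller: propagates the failure and leaves the recorded error as is. */
static int
sketch_compare_coll(const struct sketch_operand *lhs,
		    const struct sketch_operand *rhs, uint32_t *coll_id)
{
	if (sketch_expr_coll(lhs, rhs, coll_id) != 0)
		return -1;
	return 0;
}
```

If the caller set "illegal collation mix" on every failure instead, the
"no such collation" case would be reported with the wrong reason.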

>   		goto err;
> +	}
>   	uint32_t coll_id;
>   	if (collations_check_compatibility(lhs_coll_id, is_lhs_forced,
>   					   rhs_coll_id, is_rhs_forced,



