Subject: [tarantool-patches] Re: [PATCH 2/2] sql: compute resulting collation for concatenation
From: Vladislav Shpilevoy
To: Nikita Pettik, tarantool-patches@freelists.org
Date: Thu, 24 Jan 2019 21:29:01 +0300
Message-ID: <0c99e4e0-972a-2013-45b1-5c60747ee2ef@tarantool.org>
In-Reply-To: <652a9e6a4514a03ef93133961b09c2f5d45721d8.1547644180.git.korablev@tarantool.org>
References: <652a9e6a4514a03ef93133961b09c2f5d45721d8.1547644180.git.korablev@tarantool.org>

Thanks for the patch! See 7 comments below.

On 16/01/2019 16:13, Nikita Pettik wrote:
> According to ANSI, result of concatenation operation should derive
> collation sequence from its operands. Now it is not true: result is
> always comes with no ("none") collation.
>
> In a nutshell*, rules are quite simple:

1.
If you want to add a reference, use [], as in literature and as Vova
does in his commits. At first, I thought that '*' was a typo.

>
> a) If some data type has an explicit collation EC1, then every data

2. Double space after 'type'. The same occurs in some other places below.

> type that has an explicit collation shall have declared type collation
> that is EC1. The collation derivation is explicit and the collation is
> EC1.
>
> b) If every data type has an implicit collation, then:
>
> - If every data type has the same declared type collation IC1, then
> the collation derivation is implicit and the collation is IC1.
>
> - Otherwise, the collation derivation is none.
>
> c) Otherwise, the collation derivation is none.
>
> *Read complete statement at 9.5 Result of data type combinations

3. Please, say a bit more: that 9.5 is a chapter of the SQL standard,
and which year's edition of the standard it is.

>
> Closes #3937
> ---
>  src/box/sql/expr.c          |  47 +++++++++++++++++++-
>  test/sql/collation.result   | 102 ++++++++++++++++++++++++++++++++++++++++++++
>  test/sql/collation.test.lua |  46 ++++++++++++++++++++
>  3 files changed, 193 insertions(+), 2 deletions(-)
>
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index f8819f779..e6f536757 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -221,6 +221,45 @@ sql_expr_coll(Parse *parse, Expr *p, bool *is_explicit_coll, uint32_t *coll_id)
>  			}
>  			break;
>  		}
> +		if (op == TK_CONCAT) {
> +			/*
> +			 * Despite the fact that procedure below
> +			 * is very similar to collation_check_compatability(),
> +			 * it is slightly different: when both
> +			 * operands have different implicit collations,
> +			 * derived collation should be "none",
> +			 * i.e. no collation is used at all
> +			 * (instead of raising error).

4. Typo: collation_check_compatability -> collation*S*_check_compat*I*bility.
Also, I think it is not worth mentioning the difference from that
function here, especially in such a big comment: it looks like an excuse.
It is better to put a link to the standard.

> +			 */
> +			bool is_lhs_forced;
> +			uint32_t lhs_coll_id;
> +			if (sql_expr_coll(parse, p->pLeft, &is_lhs_forced,
> +					  &lhs_coll_id) != 0)
> +				return -1;
> +			bool is_rhs_forced;
> +			uint32_t rhs_coll_id;
> +			if (sql_expr_coll(parse, p->pRight, &is_rhs_forced,
> +					  &rhs_coll_id) != 0)
> +				return -1;
> +			if (is_lhs_forced && is_rhs_forced) {
> +				if (lhs_coll_id != rhs_coll_id)
> +					return -1;

5. Did you miss diag_set?

> +			}
> +			if (is_lhs_forced) {
> +				*coll_id = lhs_coll_id;
> +				*is_explicit_coll = true;
> +				return 0;

6. This function (sql_expr_coll) uses the 'break' keyword to leave the
loop, so let's be consistent and use 'break' here as well.

> +			}
> +			if (is_rhs_forced) {
> +				*coll_id = rhs_coll_id;
> +				*is_explicit_coll = true;
> +				return 0;
> +			}
> +			if (rhs_coll_id != lhs_coll_id)
> +				return 0;
> +			*coll_id = lhs_coll_id;
> +			return 0;
> +		}
>  		if (p->flags & EP_Collate) {
>  			if (p->pLeft && (p->pLeft->flags & EP_Collate) != 0) {
>  				p = p->pLeft;
> @@ -384,10 +423,14 @@ sql_binary_compare_coll_seq(Parse *parser, Expr *left, Expr *right)
>  	bool is_rhs_forced;
>  	uint32_t lhs_coll_id;
>  	uint32_t rhs_coll_id;
> -	if (sql_expr_coll(parser, left, &is_lhs_forced, &lhs_coll_id) != 0)
> +	if (sql_expr_coll(parser, left, &is_lhs_forced, &lhs_coll_id) != 0) {
> +		diag_set(ClientError, ER_ILLEGAL_COLLATION_MIX);
>  		goto err;
> -	if (sql_expr_coll(parser, right, &is_rhs_forced, &rhs_coll_id) != 0)
> +	}
> +	if (sql_expr_coll(parser, right, &is_rhs_forced, &rhs_coll_id) != 0) {
> +		diag_set(ClientError, ER_ILLEGAL_COLLATION_MIX);

7. Why do you set it here and not at point 5 above? sql_expr_coll can
return an error not only because of an illegal collation mix, but also
if a collation does not exist, for example.

>  		goto err;
> +	}
>  	uint32_t coll_id;
>  	if (collations_check_compatibility(lhs_coll_id, is_lhs_forced,
>  					   rhs_coll_id, is_rhs_forced,