Tarantool development patches archive
 help / color / mirror / Atom feed
From: "n.pettik" <korablev@tarantool.org>
To: tarantool-patches@freelists.org
Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH 2/2] sql: compute resulting collation for concatenation
Date: Fri, 15 Feb 2019 02:26:27 +0300	[thread overview]
Message-ID: <5AD6A399-D158-44CF-9180-F57185F2C376@tarantool.org> (raw)
In-Reply-To: <0c99e4e0-972a-2013-45b1-5c60747ee2ef@tarantool.org>



> On 24 Jan 2019, at 21:29, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
> 
> Thanks for the patch! See 7 comments below.

Sorry for delayed answer.

> On 16/01/2019 16:13, Nikita Pettik wrote:
>> According to ANSI, result of concatenation operation should derive
>> collation sequence from its operands. Now it is not true: result is
>> always comes with no ("none") collation.
>> In a nutshell*, rules are quite simple:
> 
> 1. If you want to put a reference, then use [<number>], like in
> literature and like Vova does in his commits. Firstly, I thought
> that '*' was a typo.

Ok, fixed.

>> a) If some data type  has an explicit collation EC1, then every data
> 
> 2. Double space after 'type'. In some other places below too.

Ok, fixed. As a rule I use vim auto-formatting utility, which aligns
borders to 72 chars automatically putting extra spaces between
sentences. I think it is kind of normal.

>> type that has an explicit collation shall have declared type collation
>> that is EC1.  The collation derivation is explicit and the collation is
>> EC1.
>> b) If every data type has an implicit collation, then:
>>  - If every data type has the same declared type collation IC1, then
>>    the collation derivation is implicit and the collation is IC1.
>>  - Otherwise, the collation derivation is none.
>> c) Otherwise, the collation derivation is none.
>> *Read complete statement at 9.5 Result of data type combinations
> 
> 3. Please, say a bit more words: that 9.5 is a chapter in an SQL
> standard, and the standard of which year it is.

Ok, done.

Fixed commit message:

Author: Nikita Pettik <korablev@tarantool.org>
Date:   Tue Jan 15 15:18:37 2019 +0300

    sql: compute resulting collation for concatenation
    
    According to ANSI, result of concatenation operation should derive
    collation sequence from its operands. Now it is not true: result is
    always comes with no ("none") collation.
    
    In a nutshell[1], rules are quite simple:
    
    a) If some data type has an explicit collation EC1, then every data type
    that has an explicit collation shall have declared type collation that
    is EC1.  The collation derivation is explicit and the collation is EC1.
    
    b) If every data type has an implicit collation, then:
    
     - If every data type has the same declared type collation IC1, then
       the collation derivation is implicit and the collation is IC1.
    
     - Otherwise, the collation derivation is none.
    
    c) Otherwise, the collation derivation is none.
    
    [1] Read complete statement at chapter 9.5 Result of data type
    combinations, ANSI 2013, Part 2: Foundations.
    
    Closes #3937

>> Closes #3937
>> ---
>>  src/box/sql/expr.c          |  47 +++++++++++++++++++-
>>  test/sql/collation.result   | 102 ++++++++++++++++++++++++++++++++++++++++++++
>>  test/sql/collation.test.lua |  46 ++++++++++++++++++++
>>  3 files changed, 193 insertions(+), 2 deletions(-)
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index f8819f779..e6f536757 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -221,6 +221,45 @@ sql_expr_coll(Parse *parse, Expr *p, bool *is_explicit_coll, uint32_t *coll_id)
>>  			}
>>  			break;
>>  		}
>> +		if (op == TK_CONCAT) {
>> +			/*
>> +			 * Despite the fact that procedure below
>> +			 * is very similar to collation_check_compatability(),
>> +			 * it is slightly different: when both
>> +			 * operands have different implicit collations,
>> +			 * derived collation should be "none",
>> +			 * i.e. no collation is used at all
>> +			 * (instead of raising error).
> 
> 4. Typo: collation_check_compatability -> collation*S*_check_compat*I*bility.
> 
> Also, I think that it is not worth mentioning the difference here
> with that function, especially in such a big comment, looks like an excuse.
> It is better to put a link to the standard.

As you wish:

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index e6f536757..031b1f16f 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -223,13 +223,10 @@ sql_expr_coll(Parse *parse, Expr *p, bool *is_explicit_coll, uint32_t *coll_id)
                }
                if (op == TK_CONCAT) {
                        /*
-                        * Despite the fact that procedure below
-                        * is very similar to collation_check_compatability(),
-                        * it is slightly different: when both
-                        * operands have different implicit collations,
-                        * derived collation should be "none",
-                        * i.e. no collation is used at all
-                        * (instead of raising error).
+                        * Procedure below provides compatibility
+                        * checks declared in ANSI SQL 2013:
+                        * chapter 9.5 Result of data type
+                        * combinations.
                         */
>> +			bool is_rhs_forced;
>> +			uint32_t rhs_coll_id;
>> +			if (sql_expr_coll(parse, p->pRight, &is_rhs_forced,
>> +					  &rhs_coll_id) != 0)
>> +				return -1;
>> +			if (is_lhs_forced && is_rhs_forced) {
>> +				if (lhs_coll_id != rhs_coll_id)
>> +					return -1;
> 
> 5. Did you miss diag_set?

No, I did it on purpose. Firstly, this function is recursive,
so in case error occurred on bottom levels of recursion,
diag would be reseted on each level above (and parser’s
counter of errors would be incremented several times as
well). No terrible consequences would take place in this case,
but it looks like a bad pattern. Anyway, fixed it according to
your suggestion.

Secondly, many functions which call sql_expr_coll
don’t expect that it can return real error and they are not
adapted to handle them (e.g. wherePathSatisfiesOrderBy).
In original SQLite this function don’t fail under any circumstances.
Sure, if error is set as Parse->err, sooner or later it will arise
(I guess this is likely to be global problem of SQLite).

Diff (cut from context, sorry):

+                       if (is_lhs_forced && is_rhs_forced) {
+                               if (lhs_coll_id != rhs_coll_id) {
+                                       diag_set(ClientError,
+                                                ER_ILLEGAL_COLLATION_MIX);
+                                       parse->nErr++;
+                                       parse->rc = SQL_TARANTOOL_ERROR;
+                                       return -1;
+                               }
+                       }

>> +			}
>> +			if (is_lhs_forced) {
>> +				*coll_id = lhs_coll_id;
>> +				*is_explicit_coll = true;
>> +				return 0;
> 
> 6. In this function (sql_expr_coll) to break the cycle 'break'
> keyword is used, so lets be consistent and use 'break' as well.

Ok:

@@ -248,17 +245,17 @@ sql_expr_coll(Parse *parse, Expr *p, bool *is_explicit_coll, uint32_t *coll_id)
                        if (is_lhs_forced) {
                                *coll_id = lhs_coll_id;
                                *is_explicit_coll = true;
-                               return 0;
+                               break;
                        }
                        if (is_rhs_forced) {
                                *coll_id = rhs_coll_id;
                                *is_explicit_coll = true;
-                               return 0;
+                               break;
                        }
                        if (rhs_coll_id != lhs_coll_id)
-                               return 0;
+                               break;
                        *coll_id = lhs_coll_id;
-                       return 0;
+                       break;
                }
                if (p->flags & EP_Collate) {
                        if (p->pLeft && (p->pLeft->flags & EP_Collate) != 0) {

>> 
>> @@ -384,10 +423,14 @@ sql_binary_compare_coll_seq(Parse *parser, Expr *left, Expr *right)
>>  	bool is_rhs_forced;
>>  	uint32_t lhs_coll_id;
>>  	uint32_t rhs_coll_id;
>> -	if (sql_expr_coll(parser, left, &is_lhs_forced, &lhs_coll_id) != 0)
>> +	if (sql_expr_coll(parser, left, &is_lhs_forced, &lhs_coll_id) != 0) {
>> +		diag_set(ClientError, ER_ILLEGAL_COLLATION_MIX);
>>  		goto err;
>> -	if (sql_expr_coll(parser, right, &is_rhs_forced, &rhs_coll_id) != 0)
>> +	}
>> +	if (sql_expr_coll(parser, right, &is_rhs_forced, &rhs_coll_id) != 0) {
>> +		diag_set(ClientError, ER_ILLEGAL_COLLATION_MIX);
> 
> 7. Why do you set it here and not on the point 5 above? sql_expr_coll
> can return an error not only because of illegal collation mix, but also
> if a collation does not exist, for example.

Hm, I missed this fact. Ok, I’ve done it, see above.

  reply	other threads:[~2019-02-14 23:26 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-16 13:13 [tarantool-patches] [PATCH 0/2] Compute derived " Nikita Pettik
2019-01-16 13:13 ` [tarantool-patches] [PATCH 1/2] sql: refactor sql_expr_coll and sql_binary_compare_coll_seq functions Nikita Pettik
2019-01-17 13:28   ` [tarantool-patches] " Konstantin Osipov
2019-01-24 18:28   ` Vladislav Shpilevoy
2019-02-14 23:26     ` n.pettik
2019-01-16 13:13 ` [tarantool-patches] [PATCH 2/2] sql: compute resulting collation for concatenation Nikita Pettik
2019-01-17 13:33   ` [tarantool-patches] " Konstantin Osipov
2019-01-17 19:19     ` n.pettik
2019-01-18  9:59     ` Kirill Yukhin
2019-01-24 18:29   ` Vladislav Shpilevoy
2019-02-14 23:26     ` n.pettik [this message]
2019-02-22 11:23       ` Vladislav Shpilevoy
2019-02-22 13:40         ` n.pettik
2019-02-22 13:51           ` Vladislav Shpilevoy
2019-02-25 11:29 ` [tarantool-patches] Re: [PATCH 0/2] Compute derived " Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5AD6A399-D158-44CF-9180-F57185F2C376@tarantool.org \
    --to=korablev@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH 2/2] sql: compute resulting collation for concatenation' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox