From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp54.i.mail.ru (smtp54.i.mail.ru [217.69.128.34]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 745614696C3 for ; Wed, 22 Apr 2020 13:03:07 +0300 (MSK) Date: Wed, 22 Apr 2020 13:03:04 +0300 From: Mergen Imeev Message-ID: <20200422100303.GA29050@tarantool.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH v1 2/2] sql: fix comparison in IN with subselect List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy Cc: tarantool-patches@dev.tarantool.org Hi! Thank you for review. My answers, diff and new patch below. On Sun, Apr 19, 2020 at 07:47:23PM +0200, Vladislav Shpilevoy wrote: > Hi! Thanks for the patch! > > See 6 comments below. > > On 15/04/2020 13:47, imeevma@tarantool.org wrote: > > After this patch, the IN statement checks the compatibility of the > > values from subselect ​​on the right with the value on the left. If > > 1. I see broken unicode after 'subselect' on this line, when I do > 'git log'. > Fixed. > > values from subselect contains a value that is not comparable with > > the left value, it throws an error. > > > > Closes #4692 > > --- > > src/box/sql/expr.c | 27 ++++++++++++++++++++++ > > .../gh-4692-comparison-in-IN-operator.test.lua | 24 ++++++++++++++++++- > > test/sql-tap/in3.test.lua | 2 +- > > test/sql-tap/subquery.test.lua | 8 +++---- > > test/sql-tap/tkt-80e031a00f.test.lua | 8 +++---- > > test/sql/boolean.result | 12 +++++----- > > 6 files changed, 65 insertions(+), 16 deletions(-) > > > > diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c > > index 4fe4f83..c20c04b 100644 > > --- a/src/box/sql/expr.c > > +++ b/src/box/sql/expr.c > > @@ -2711,6 +2711,28 @@ expr_in_type(Parse *pParse, Expr *pExpr) > > return zRet; > > } > > > > +static inline bool > > +is_in_type_cmp(struct Parse *parse, enum field_type lhs_type, struct Expr *expr) > > 2. Sorry, can't parse the name. Can't understand what is 'is cmp'. > Could you please rename and add a comment maybe? > Fixed. I gave it a new name, 'is_comparable'. It is not a lot better, still since it is just a static inline function, will this be enough? > > +{ > > + uint32_t fieldno = expr->iColumn; > > + enum field_type rhs_type = expr->space_def == NULL ? > > + rhs_type = expr->type : > > 3. You basically wrote 'rhs_type = (rhs_type = expr->type)'. You > may need to remove the second 'rhs_type = '. > Sorry, fixed. > > + expr->space_def->fields[fieldno].type; > > + if (rhs_type == lhs_type) > > + return true; > > + if (rhs_type == FIELD_TYPE_ANY || lhs_type == FIELD_TYPE_ANY) > > + return true; > > + if (rhs_type == FIELD_TYPE_SCALAR || lhs_type == FIELD_TYPE_SCALAR) > > + return true; > > + if (sql_type_is_numeric(rhs_type) && sql_type_is_numeric(lhs_type)) > > + return true; > > + parse->is_aborted = true; > > + diag_set(ClientError, ER_SQL_TYPE_MISMATCH, field_type_strs[rhs_type], > > + field_type_strs[lhs_type]); > > 4. Wouldn't it be the same to check > > field_type1_contains_type2(lhs_type, rhs_type) || > field_type1_contains_type2(rhs_type, lhs_type) > > ? > Not exactly, since we can compare INTEGERs with DOUBLEs. However, thank you for this suggestion! > > + return false; > > +} > > + > > + > > /* > > * Generate code for scalar subqueries used as a subquery expression, EXISTS, > > * or IN operators. Examples: > > @@ -2821,6 +2843,11 @@ sqlCodeSubselect(Parse * pParse, /* Parsing context */ > > pExpr->iTable, reg_eph); > > dest.dest_type = > > expr_in_type(pParse, pExpr); > > + if (nVal == 1 && > > 5. What if it is not 1? How types are checked then? > Fixed. Nikita suggested to do this only to values whose length is 1 since it should be simpler. Still, it appears to be not harder to check all fields. Also, I found out that my previous solution didn't worked as intended in case there is a vector or a SELECT on the left of IN. > > + !is_in_type_cmp(pParse, > > + sql_expr_type(pLeft), > > + pEList->a[0].pExpr)) > > + return 0; > > assert((pExpr->iTable & 0x0000FFFF) == > > pExpr->iTable); > > pSelect->iLimit = 0; > > diff --git a/test/sql-tap/subquery.test.lua b/test/sql-tap/subquery.test.lua > > index 6bedf58..e03a6a0 100755 > > --- a/test/sql-tap/subquery.test.lua > > +++ b/test/sql-tap/subquery.test.lua > > @@ -358,12 +358,12 @@ test:do_test( > > -- comparision. Hence, the integer value 10 in t3 will compare equal > > -- to the string value '10.0' in t4 because the t4 value will be > > -- converted into an integer. > > - return test:execsql [[ > > + return test:catchsql [[ > > 6. The comment above is outdated. Probably better to remove this test, > since it no longer serves its purpose. Or change the comment. Please, > check other changed tests too. Maybe they also became irrelevant or their > comments are misleading. I removed comments for all test I changed since they all were outdated. Diff: commit 3c01abf7d969dfaf72b3798269032f32b6d74356 Author: Mergen Imeev Date: Wed Apr 22 10:52:46 2020 +0300 Review fix diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c index c20c04b..9cd3a2c 100644 --- a/src/box/sql/expr.c +++ b/src/box/sql/expr.c @@ -2711,28 +2711,49 @@ expr_in_type(Parse *pParse, Expr *pExpr) return zRet; } +/** + * Check that arguments on both sides of IN are comparable. + * + * @param parse Parsing context. + * @param pExpr Expr that contains all operands. + * + * @retval true if comparable, false otherwise. + */ static inline bool -is_in_type_cmp(struct Parse *parse, enum field_type lhs_type, struct Expr *expr) +is_comparable(struct Parse *parse, struct Expr *pExpr) { - uint32_t fieldno = expr->iColumn; - enum field_type rhs_type = expr->space_def == NULL ? - rhs_type = expr->type : - expr->space_def->fields[fieldno].type; - if (rhs_type == lhs_type) - return true; - if (rhs_type == FIELD_TYPE_ANY || lhs_type == FIELD_TYPE_ANY) - return true; - if (rhs_type == FIELD_TYPE_SCALAR || lhs_type == FIELD_TYPE_SCALAR) - return true; - if (sql_type_is_numeric(rhs_type) && sql_type_is_numeric(lhs_type)) - return true; - parse->is_aborted = true; - diag_set(ClientError, ER_SQL_TYPE_MISMATCH, field_type_strs[rhs_type], - field_type_strs[lhs_type]); - return false; + uint32_t size = sqlExprVectorSize(pExpr->pLeft); + ExprList *lhs_list = NULL; + ExprList *rhs_list = pExpr->x.pSelect->pEList; + u8 op = pExpr->pLeft->op; + if (op == TK_REGISTER) + op = pExpr->pLeft->op2; + if (op == TK_VECTOR) + lhs_list = pExpr->pLeft->x.pList; + else if (op == TK_SELECT) + lhs_list = pExpr->pLeft->x.pSelect->pEList; + assert(size == 1 || lhs_list != NULL); + + for (uint32_t i = 0; i < size; ++i) { + struct Expr *lhs = lhs_list == NULL ? pExpr->pLeft : + lhs_list->a[i].pExpr; + struct Expr *rhs = rhs_list->a[i].pExpr; + enum field_type lhs_type = sql_expr_type(lhs); + enum field_type rhs_type = sql_expr_type(rhs); + if (field_type1_contains_type2(lhs_type, rhs_type) || + field_type1_contains_type2(rhs_type, lhs_type)) + continue; + if (sql_type_is_numeric(rhs_type) && + sql_type_is_numeric(lhs_type)) + continue; + parse->is_aborted = true; + diag_set(ClientError, ER_SQL_TYPE_MISMATCH, + field_type_strs[rhs_type], field_type_strs[lhs_type]); + return false; + } + return true; } - /* * Generate code for scalar subqueries used as a subquery expression, EXISTS, * or IN operators. Examples: @@ -2843,10 +2864,7 @@ sqlCodeSubselect(Parse * pParse, /* Parsing context */ pExpr->iTable, reg_eph); dest.dest_type = expr_in_type(pParse, pExpr); - if (nVal == 1 && - !is_in_type_cmp(pParse, - sql_expr_type(pLeft), - pEList->a[0].pExpr)) + if (!is_comparable(pParse, pExpr)) return 0; assert((pExpr->iTable & 0x0000FFFF) == pExpr->iTable); diff --git a/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua b/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua index ef76290..a03b688 100755 --- a/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua +++ b/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua @@ -1,6 +1,6 @@ #!/usr/bin/env tarantool test = require("sqltester") -test:plan(4) +test:plan(11) -- -- If left value of IN and NOT IN operators is not a vector with @@ -25,25 +25,80 @@ test:do_catchsql_test( }) test:do_catchsql_test( - "gh-3692-2.1", + "gh-4692-2.1", [[ - CREATE TABLE t(i INT PRIMARY KEY, a BOOLEAN, b VARBINARY); - INSERT INTO t VALUES (1, true, X'31'); - SELECT true in (SELECT b FROM t); + SELECT true IN (SELECT X'31'); ]], { - -- 1, "Type mismatch: can not convert varbinary to boolean" - -- }) test:do_catchsql_test( - "gh-3692-2.2", + "gh-4692-2.2", [[ - SELECT X'31' in (SELECT a FROM t); + SELECT X'31' IN (SELECT true); ]], { - -- 1, "Type mismatch: can not convert boolean to varbinary" - -- + }) + +test:do_catchsql_test( + "gh-4692-2.3", + [[ + SELECT (1, 'a') IN (SELECT 1, 2); + ]], { + 1, "Type mismatch: can not convert integer to string" + }) + +test:do_catchsql_test( + "gh-4692-2.4", + [[ + SELECT (SELECT 1, 'a') IN (SELECT 1, 2); + ]], { + 1, "Type mismatch: can not convert integer to string" + }) + +test:execsql([[ + CREATE TABLE t (i INT PRIMARY KEY, a BOOLEAN, b VARBINARY); + INSERT INTO t VALUES(1, true, X'31'); + ]]) + +test:do_catchsql_test( + "gh-4692-2.5", + [[ + SELECT true IN (SELECT b FROM t); + ]], { + 1, "Type mismatch: can not convert varbinary to boolean" + }) + +test:do_catchsql_test( + "gh-4692-2.6", + [[ + SELECT X'31' IN (SELECT a FROM t); + ]], { + 1, "Type mismatch: can not convert boolean to varbinary" + }) + +test:do_catchsql_test( + "gh-4692-2.7", + [[ + SELECT (1, 'a') IN (SELECT i, a from t); + ]], { + 1, "Type mismatch: can not convert boolean to string" + }) + +test:do_catchsql_test( + "gh-4692-2.8", + [[ + SELECT (SELECT 1, 'a') IN (SELECT i, a from t); + ]], { + 1, "Type mismatch: can not convert boolean to string" + }) + +test:do_catchsql_test( + "gh-4692-2.9", + [[ + SELECT (SELECT i, a from t) IN (SELECT a, i from t); + ]], { + 1, "Type mismatch: can not convert boolean to integer" }) test:finish_test() diff --git a/test/sql-tap/in3.test.lua b/test/sql-tap/in3.test.lua index e4886f7..e3f8682 100755 --- a/test/sql-tap/in3.test.lua +++ b/test/sql-tap/in3.test.lua @@ -373,8 +373,6 @@ test:do_test( test:do_test( "in3-3.7", function() - -- Numeric affinity is applied before the comparison takes place. - -- Making it impossible to use index t1_i3. return exec_neph(" SELECT y IN (SELECT CAST(c AS INTEGER) FROM t1) FROM t2 ") end, { -- diff --git a/test/sql-tap/subquery.test.lua b/test/sql-tap/subquery.test.lua index a5a8e95..527ea2b 100755 --- a/test/sql-tap/subquery.test.lua +++ b/test/sql-tap/subquery.test.lua @@ -352,12 +352,6 @@ test:do_execsql_test( test:do_test( "subquery-2.5.2", function() - -- In the expr "x IN (SELECT a FROM t3)" the RHS of the IN operator - -- has text affinity and the LHS has integer affinity. The rule is - -- that we try to convert both sides to an integer before doing the - -- comparision. Hence, the integer value 10 in t3 will compare equal - -- to the string value '10.0' in t4 because the t4 value will be - -- converted into an integer. return test:catchsql [[ SELECT * FROM t4 WHERE x IN (SELECT a FROM t3); ]] @@ -370,8 +364,6 @@ test:do_test( test:do_test( "subquery-2.5.3.1", function() - -- The t4i index cannot be used to resolve the "x IN (...)" constraint - -- because the constraint has integer affinity but t4i has text affinity. return test:catchsql [[ CREATE INDEX t4i ON t4(x); SELECT * FROM t4 WHERE x IN (SELECT a FROM t3); New patch: >From 946cd381dd0af8ba1b534e9721417423f315a542 Mon Sep 17 00:00:00 2001 From: Mergen Imeev Date: Wed, 15 Apr 2020 02:17:43 +0300 Subject: [PATCH] sql: fix comparison in IN with subselect After this patch, the IN statement checks the compatibility of the values from subselect on the right with the value on the left. If values from subselect contains a value that is not comparable with the left value, it throws an error. Closes #4692 diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c index 4fe4f83..9cd3a2c 100644 --- a/src/box/sql/expr.c +++ b/src/box/sql/expr.c @@ -2711,6 +2711,49 @@ expr_in_type(Parse *pParse, Expr *pExpr) return zRet; } +/** + * Check that arguments on both sides of IN are comparable. + * + * @param parse Parsing context. + * @param pExpr Expr that contains all operands. + * + * @retval true if comparable, false otherwise. + */ +static inline bool +is_comparable(struct Parse *parse, struct Expr *pExpr) +{ + uint32_t size = sqlExprVectorSize(pExpr->pLeft); + ExprList *lhs_list = NULL; + ExprList *rhs_list = pExpr->x.pSelect->pEList; + u8 op = pExpr->pLeft->op; + if (op == TK_REGISTER) + op = pExpr->pLeft->op2; + if (op == TK_VECTOR) + lhs_list = pExpr->pLeft->x.pList; + else if (op == TK_SELECT) + lhs_list = pExpr->pLeft->x.pSelect->pEList; + assert(size == 1 || lhs_list != NULL); + + for (uint32_t i = 0; i < size; ++i) { + struct Expr *lhs = lhs_list == NULL ? pExpr->pLeft : + lhs_list->a[i].pExpr; + struct Expr *rhs = rhs_list->a[i].pExpr; + enum field_type lhs_type = sql_expr_type(lhs); + enum field_type rhs_type = sql_expr_type(rhs); + if (field_type1_contains_type2(lhs_type, rhs_type) || + field_type1_contains_type2(rhs_type, lhs_type)) + continue; + if (sql_type_is_numeric(rhs_type) && + sql_type_is_numeric(lhs_type)) + continue; + parse->is_aborted = true; + diag_set(ClientError, ER_SQL_TYPE_MISMATCH, + field_type_strs[rhs_type], field_type_strs[lhs_type]); + return false; + } + return true; +} + /* * Generate code for scalar subqueries used as a subquery expression, EXISTS, * or IN operators. Examples: @@ -2821,6 +2864,8 @@ sqlCodeSubselect(Parse * pParse, /* Parsing context */ pExpr->iTable, reg_eph); dest.dest_type = expr_in_type(pParse, pExpr); + if (!is_comparable(pParse, pExpr)) + return 0; assert((pExpr->iTable & 0x0000FFFF) == pExpr->iTable); pSelect->iLimit = 0; diff --git a/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua b/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua index b1d8a5f..a03b688 100755 --- a/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua +++ b/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua @@ -1,6 +1,6 @@ #!/usr/bin/env tarantool test = require("sqltester") -test:plan(2) +test:plan(11) -- -- If left value of IN and NOT IN operators is not a vector with @@ -24,5 +24,82 @@ test:do_catchsql_test( 1, "Type mismatch: can not convert 3 to varbinary" }) +test:do_catchsql_test( + "gh-4692-2.1", + [[ + SELECT true IN (SELECT X'31'); + ]], { + 1, "Type mismatch: can not convert varbinary to boolean" + }) + +test:do_catchsql_test( + "gh-4692-2.2", + [[ + SELECT X'31' IN (SELECT true); + ]], { + 1, "Type mismatch: can not convert boolean to varbinary" + }) + +test:do_catchsql_test( + "gh-4692-2.3", + [[ + SELECT (1, 'a') IN (SELECT 1, 2); + ]], { + 1, "Type mismatch: can not convert integer to string" + }) + +test:do_catchsql_test( + "gh-4692-2.4", + [[ + SELECT (SELECT 1, 'a') IN (SELECT 1, 2); + ]], { + 1, "Type mismatch: can not convert integer to string" + }) + +test:execsql([[ + CREATE TABLE t (i INT PRIMARY KEY, a BOOLEAN, b VARBINARY); + INSERT INTO t VALUES(1, true, X'31'); + ]]) + +test:do_catchsql_test( + "gh-4692-2.5", + [[ + SELECT true IN (SELECT b FROM t); + ]], { + 1, "Type mismatch: can not convert varbinary to boolean" + }) + +test:do_catchsql_test( + "gh-4692-2.6", + [[ + SELECT X'31' IN (SELECT a FROM t); + ]], { + 1, "Type mismatch: can not convert boolean to varbinary" + }) + +test:do_catchsql_test( + "gh-4692-2.7", + [[ + SELECT (1, 'a') IN (SELECT i, a from t); + ]], { + 1, "Type mismatch: can not convert boolean to string" + }) + +test:do_catchsql_test( + "gh-4692-2.8", + [[ + SELECT (SELECT 1, 'a') IN (SELECT i, a from t); + ]], { + 1, "Type mismatch: can not convert boolean to string" + }) + +test:do_catchsql_test( + "gh-4692-2.9", + [[ + SELECT (SELECT i, a from t) IN (SELECT a, i from t); + ]], { + 1, "Type mismatch: can not convert boolean to integer" + }) + test:finish_test() diff --git a/test/sql-tap/in3.test.lua b/test/sql-tap/in3.test.lua index e29db9d..e3f8682 100755 --- a/test/sql-tap/in3.test.lua +++ b/test/sql-tap/in3.test.lua @@ -373,9 +373,7 @@ test:do_test( test:do_test( "in3-3.7", function() - -- Numeric affinity is applied before the comparison takes place. - -- Making it impossible to use index t1_i3. - return exec_neph(" SELECT y IN (SELECT c FROM t1) FROM t2 ") + return exec_neph(" SELECT y IN (SELECT CAST(c AS INTEGER) FROM t1) FROM t2 ") end, { -- 1, true diff --git a/test/sql-tap/subquery.test.lua b/test/sql-tap/subquery.test.lua index 15c4c82..527ea2b 100755 --- a/test/sql-tap/subquery.test.lua +++ b/test/sql-tap/subquery.test.lua @@ -352,33 +352,25 @@ test:do_execsql_test( test:do_test( "subquery-2.5.2", function() - -- In the expr "x IN (SELECT a FROM t3)" the RHS of the IN operator - -- has text affinity and the LHS has integer affinity. The rule is - -- that we try to convert both sides to an integer before doing the - -- comparision. Hence, the integer value 10 in t3 will compare equal - -- to the string value '10.0' in t4 because the t4 value will be - -- converted into an integer. - return test:execsql [[ + return test:catchsql [[ SELECT * FROM t4 WHERE x IN (SELECT a FROM t3); ]] end, { -- - "10" + 1, "Type mismatch: can not convert integer to string" -- }) test:do_test( "subquery-2.5.3.1", function() - -- The t4i index cannot be used to resolve the "x IN (...)" constraint - -- because the constraint has integer affinity but t4i has text affinity. - return test:execsql [[ + return test:catchsql [[ CREATE INDEX t4i ON t4(x); SELECT * FROM t4 WHERE x IN (SELECT a FROM t3); ]] end, { -- - "10" + 1, "Type mismatch: can not convert integer to string" -- }) diff --git a/test/sql-tap/tkt-80e031a00f.test.lua b/test/sql-tap/tkt-80e031a00f.test.lua index a0e6539..2e0a921 100755 --- a/test/sql-tap/tkt-80e031a00f.test.lua +++ b/test/sql-tap/tkt-80e031a00f.test.lua @@ -346,7 +346,7 @@ test:do_catchsql_test( SELECT 'hello' IN t1 ]], { -- - 1, 'Type mismatch: can not convert hello to integer' + 1, 'Type mismatch: can not convert integer to string' -- }) @@ -356,7 +356,7 @@ test:do_catchsql_test( SELECT 'hello' NOT IN t1 ]], { -- - 1, 'Type mismatch: can not convert hello to integer' + 1, 'Type mismatch: can not convert integer to string' -- }) @@ -386,7 +386,7 @@ test:do_catchsql_test( SELECT x'303132' IN t1 ]], { -- - 1, 'Type mismatch: can not convert varbinary to integer' + 1, 'Type mismatch: can not convert integer to varbinary' -- }) @@ -396,7 +396,7 @@ test:do_catchsql_test( SELECT x'303132' NOT IN t1 ]], { -- - 1, 'Type mismatch: can not convert varbinary to integer' + 1, 'Type mismatch: can not convert integer to varbinary' -- }) diff --git a/test/sql/boolean.result b/test/sql/boolean.result index b893f2a..112f0ac 100644 --- a/test/sql/boolean.result +++ b/test/sql/boolean.result @@ -3859,12 +3859,12 @@ SELECT false IN (0, 1, 2, 3); SELECT true IN (SELECT b FROM t7); | --- | - null - | - 'Type mismatch: can not convert TRUE to integer' + | - 'Type mismatch: can not convert integer to boolean' | ... SELECT false IN (SELECT b FROM t7); | --- | - null - | - 'Type mismatch: can not convert FALSE to integer' + | - 'Type mismatch: can not convert integer to boolean' | ... SELECT a1, a1 IN (0, 1, 2, 3) FROM t6 | --- @@ -5002,22 +5002,22 @@ SELECT a2 IN (0.1, 1.2, 2.3, 3.4) FROM t6 LIMIT 1; SELECT true IN (SELECT c FROM t8); | --- | - null - | - 'Type mismatch: can not convert TRUE to number' + | - 'Type mismatch: can not convert number to boolean' | ... SELECT false IN (SELECT c FROM t8); | --- | - null - | - 'Type mismatch: can not convert FALSE to number' + | - 'Type mismatch: can not convert number to boolean' | ... SELECT a1 IN (SELECT c FROM t8) FROM t6 LIMIT 1; | --- | - null - | - 'Type mismatch: can not convert FALSE to number' + | - 'Type mismatch: can not convert number to boolean' | ... SELECT a2 IN (SELECT c FROM t8) FROM t6 LIMIT 1; | --- | - null - | - 'Type mismatch: can not convert TRUE to number' + | - 'Type mismatch: can not convert number to boolean' | ... SELECT true BETWEEN 0.1 and 9.9;