[Tarantool-patches] [PATCH v1 2/2] sql: fix comparison in IN with subselect

Mergen Imeev imeevma at tarantool.org
Thu Apr 30 13:22:54 MSK 2020


Hi! Thank you for the review! My asnwers and new patch
below.

On Tue, Apr 28, 2020 at 12:53:11AM +0200, Vladislav Shpilevoy wrote:
> Hi! Thanks for the fixes!
> 
> >>> values from subselect contains a value that is not comparable with
> >>> the left value, it throws an error.
> >>>
> >>> Closes #4692
> >>> ---
> >>>  src/box/sql/expr.c                                 | 27 ++++++++++++++++++++++
> >>>  .../gh-4692-comparison-in-IN-operator.test.lua     | 24 ++++++++++++++++++-
> >>>  test/sql-tap/in3.test.lua                          |  2 +-
> >>>  test/sql-tap/subquery.test.lua                     |  8 +++----
> >>>  test/sql-tap/tkt-80e031a00f.test.lua               |  8 +++----
> >>>  test/sql/boolean.result                            | 12 +++++-----
> >>>  6 files changed, 65 insertions(+), 16 deletions(-)
> >>>
> >>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> >>> index 4fe4f83..c20c04b 100644
> >>> --- a/src/box/sql/expr.c
> >>> +++ b/src/box/sql/expr.c
> >>> @@ -2711,6 +2711,28 @@ expr_in_type(Parse *pParse, Expr *pExpr)
> >>>  	return zRet;
> >>>  }
> >>>  
> >>> +static inline bool
> >>> +is_in_type_cmp(struct Parse *parse, enum field_type lhs_type, struct Expr *expr)
> >>
> >> 2. Sorry, can't parse the name. Can't understand what is 'is cmp'.
> >> Could you please rename and add a comment maybe?
> >>
> > Fixed. I gave it a new name, 'is_comparable'. It is not a
> > lot better, still since it is just a static inline function,
> > will this be enough?
> 
> I am afraid it is too common for a huge file like expr.c. Just
> extend the old name "is_in_type_cmp()" -> "are_in_args_comparable()".
> 
Thanks, fixed. I changed name to
are_in_args_types_comparable(). Also, expanded description
of the function. I think it is better to let this function
to check comparability of the types alone.

> > From 946cd381dd0af8ba1b534e9721417423f315a542 Mon Sep 17 00:00:00 2001
> > From: Mergen Imeev <imeevma at gmail.com>
> > Date: Wed, 15 Apr 2020 02:17:43 +0300
> > Subject: [PATCH] sql: fix comparison in IN with subselect
> > 
> > After this patch, the IN statement checks the compatibility of the
> > values from subselect on the right with the value on the left. If
> > values from subselect contains a value that is not comparable with
> > the left value, it throws an error.
> > 
> > Closes #4692
> > 
> > diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> > index 4fe4f83..9cd3a2c 100644
> > --- a/src/box/sql/expr.c
> > +++ b/src/box/sql/expr.c
> > @@ -2711,6 +2711,49 @@ expr_in_type(Parse *pParse, Expr *pExpr)
> >  	return zRet;
> >  }
> >  
> > +/**
> > + * Check that arguments on both sides of IN are comparable.
> > + *
> > + * @param parse Parsing context.
> > + * @param pExpr Expr that contains all operands.
> > + *
> > + * @retval true if comparable, false otherwise.
> > + */
> > +static inline bool
> > +is_comparable(struct Parse *parse, struct Expr *pExpr)
> > +{
> > +	uint32_t size = sqlExprVectorSize(pExpr->pLeft);
> > +	ExprList *lhs_list = NULL;
> > +	ExprList *rhs_list = pExpr->x.pSelect->pEList;
> > +	u8 op = pExpr->pLeft->op;
> > +	if (op == TK_REGISTER)
> > +		op = pExpr->pLeft->op2;
> > +	if (op == TK_VECTOR)
> > +		lhs_list = pExpr->pLeft->x.pList;
> > +	else if (op == TK_SELECT)
> > +		lhs_list = pExpr->pLeft->x.pSelect->pEList;
> > +	assert(size == 1 || lhs_list != NULL);
> 
> What is happening on the lines above?
> 
Before working with the left value, we need to find out if
it is a scalar, vector, or subselect. In the case of a
vector or a subselect, we should get a list of expressions
containing information about each field in the left value.
We will use this list to check comparability.

> Also, I see similar code below is_comparable() invocation:
> 
> 	for (i = 0; i < nVal; i++) {
> 		Expr *p =
> 		    sqlVectorFieldSubexpr
> 		    (pLeft, i);
> 		if (sql_binary_compare_coll_seq(pParse, p, pEList->a[i].pExpr,
> 						&key_info->parts[i].coll_id) != 0)
> 			return 0;
> 	}
> 
> It is ugly, but the point is they use sqlVectorFieldSubexpr(pLeft, i)
> without any 'TK_VECTOR', 'TK_SELECT', 'TK_REGISTER' checks. How
> does it work?
We do not need to check here since we already know that
the right side is a subselect.

> And shouldn't we move this cycle into is_comparable(),
> because seems like it also checks if types are comparable when they
> are strings. However I am not sure.
> 
It is true that compability of collations checked here, but
it returns ID of collation that should be selected after
this check. This id is used during creation of format and
index of ephemeral space.

> > +
> > +	for (uint32_t i = 0; i < size; ++i) {
> > +		struct Expr *lhs = lhs_list == NULL ? pExpr->pLeft :
> > +				   lhs_list->a[i].pExpr;
> > +		struct Expr *rhs = rhs_list->a[i].pExpr;
> > +		enum field_type lhs_type = sql_expr_type(lhs);
> > +		enum field_type rhs_type = sql_expr_type(rhs);
> > +		if (field_type1_contains_type2(lhs_type, rhs_type) ||
> > +		    field_type1_contains_type2(rhs_type, lhs_type))
> > +			continue;
> > +		if (sql_type_is_numeric(rhs_type) &&
> > +		    sql_type_is_numeric(lhs_type))
> > +			continue;
> > +		parse->is_aborted = true;
> > +		diag_set(ClientError, ER_SQL_TYPE_MISMATCH,
> > +			 field_type_strs[rhs_type], field_type_strs[lhs_type]);
> > +		return false;
> > +	}
> > +	return true;
> > +}


New patch:


>From 495d3c0b9fdb17ca99a4cae7528fa448d4ea07ac Mon Sep 17 00:00:00 2001
From: Mergen Imeev <imeevma at gmail.com>
Date: Wed, 15 Apr 2020 02:17:43 +0300
Subject: [PATCH] sql: fix comparison in IN with subselect

After this patch, the IN statement checks the compatibility of the
values from subselect on the right with the value on the left. If
values from subselect contains a value that is not comparable with
the left value, it throws an error.

Closes #4692

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 4fe4f83..59d128f 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -2711,6 +2711,51 @@ expr_in_type(Parse *pParse, Expr *pExpr)
 	return zRet;
 }
 
+/**
+ * Check that field types in arguments on both sides of IN are
+ * comparable. Note that it does not check that fields of type
+ * string have comparable collations.
+ *
+ * @param parse Parsing context.
+ * @param pExpr Expr that contains all operands.
+ *
+ * @retval true if comparable, false otherwise.
+ */
+static inline bool
+are_in_args_types_comparable(struct Parse *parse, struct Expr *pExpr)
+{
+	uint32_t size = sqlExprVectorSize(pExpr->pLeft);
+	ExprList *lhs_list = NULL;
+	ExprList *rhs_list = pExpr->x.pSelect->pEList;
+	u8 op = pExpr->pLeft->op;
+	if (op == TK_REGISTER)
+		op = pExpr->pLeft->op2;
+	if (op == TK_VECTOR)
+		lhs_list = pExpr->pLeft->x.pList;
+	else if (op == TK_SELECT)
+		lhs_list = pExpr->pLeft->x.pSelect->pEList;
+	assert(size == 1 || lhs_list != NULL);
+
+	for (uint32_t i = 0; i < size; ++i) {
+		struct Expr *lhs = lhs_list == NULL ? pExpr->pLeft :
+				   lhs_list->a[i].pExpr;
+		struct Expr *rhs = rhs_list->a[i].pExpr;
+		enum field_type lhs_type = sql_expr_type(lhs);
+		enum field_type rhs_type = sql_expr_type(rhs);
+		if (field_type1_contains_type2(lhs_type, rhs_type) ||
+		    field_type1_contains_type2(rhs_type, lhs_type))
+			continue;
+		if (sql_type_is_numeric(rhs_type) &&
+		    sql_type_is_numeric(lhs_type))
+			continue;
+		parse->is_aborted = true;
+		diag_set(ClientError, ER_SQL_TYPE_MISMATCH,
+			 field_type_strs[rhs_type], field_type_strs[lhs_type]);
+		return false;
+	}
+	return true;
+}
+
 /*
  * Generate code for scalar subqueries used as a subquery expression, EXISTS,
  * or IN operators.  Examples:
@@ -2821,6 +2866,8 @@ sqlCodeSubselect(Parse * pParse,	/* Parsing context */
 							      pExpr->iTable, reg_eph);
 					dest.dest_type =
 						expr_in_type(pParse, pExpr);
+					if (!are_in_args_types_comparable(pParse, pExpr))
+						return 0;
 					assert((pExpr->iTable & 0x0000FFFF) ==
 					       pExpr->iTable);
 					pSelect->iLimit = 0;
diff --git a/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua b/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua
index b1d8a5f..a03b688 100755
--- a/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua
+++ b/test/sql-tap/gh-4692-comparison-in-IN-operator.test.lua
@@ -1,6 +1,6 @@
 #!/usr/bin/env tarantool
 test = require("sqltester")
-test:plan(2)
+test:plan(11)
 
 --
 -- If left value of IN and NOT IN operators is not a vector with
@@ -24,5 +24,82 @@ test:do_catchsql_test(
         1, "Type mismatch: can not convert 3 to varbinary"
     })
 
+test:do_catchsql_test(
+    "gh-4692-2.1",
+    [[
+        SELECT true IN (SELECT X'31');
+    ]], {
+        1, "Type mismatch: can not convert varbinary to boolean"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.2",
+    [[
+        SELECT X'31' IN (SELECT true);
+    ]], {
+        1, "Type mismatch: can not convert boolean to varbinary"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.3",
+    [[
+        SELECT (1, 'a') IN (SELECT 1, 2);
+    ]], {
+        1, "Type mismatch: can not convert integer to string"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.4",
+    [[
+        SELECT (SELECT 1, 'a') IN (SELECT 1, 2);
+    ]], {
+        1, "Type mismatch: can not convert integer to string"
+    })
+
+test:execsql([[
+        CREATE TABLE t (i INT PRIMARY KEY, a BOOLEAN, b VARBINARY);
+        INSERT INTO t VALUES(1, true, X'31');
+    ]])
+
+test:do_catchsql_test(
+    "gh-4692-2.5",
+    [[
+        SELECT true IN (SELECT b FROM t);
+    ]], {
+        1, "Type mismatch: can not convert varbinary to boolean"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.6",
+    [[
+        SELECT X'31' IN (SELECT a FROM t);
+    ]], {
+        1, "Type mismatch: can not convert boolean to varbinary"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.7",
+    [[
+        SELECT (1, 'a') IN (SELECT i, a from t);
+    ]], {
+        1, "Type mismatch: can not convert boolean to string"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.8",
+    [[
+        SELECT (SELECT 1, 'a') IN (SELECT i, a from t);
+    ]], {
+        1, "Type mismatch: can not convert boolean to string"
+    })
+
+test:do_catchsql_test(
+    "gh-4692-2.9",
+    [[
+        SELECT (SELECT i, a from t) IN (SELECT a, i from t);
+    ]], {
+        1, "Type mismatch: can not convert boolean to integer"
+    })
+
 test:finish_test()
 
diff --git a/test/sql-tap/in3.test.lua b/test/sql-tap/in3.test.lua
index e29db9d..e3f8682 100755
--- a/test/sql-tap/in3.test.lua
+++ b/test/sql-tap/in3.test.lua
@@ -373,9 +373,7 @@ test:do_test(
 test:do_test(
     "in3-3.7",
     function()
-        -- Numeric affinity is applied before the comparison takes place. 
-        -- Making it impossible to use index t1_i3.
-        return exec_neph(" SELECT y IN (SELECT c FROM t1) FROM t2 ")
+        return exec_neph(" SELECT y IN (SELECT CAST(c AS INTEGER) FROM t1) FROM t2 ")
     end, {
         -- <in3-3.7>
         1, true
diff --git a/test/sql-tap/subquery.test.lua b/test/sql-tap/subquery.test.lua
index 15c4c82..527ea2b 100755
--- a/test/sql-tap/subquery.test.lua
+++ b/test/sql-tap/subquery.test.lua
@@ -352,33 +352,25 @@ test:do_execsql_test(
 test:do_test(
     "subquery-2.5.2",
     function()
-        -- In the expr "x IN (SELECT a FROM t3)" the RHS of the IN operator
-        -- has text affinity and the LHS has integer affinity.  The rule is
-        -- that we try to convert both sides to an integer before doing the
-        -- comparision. Hence, the integer value 10 in t3 will compare equal
-        -- to the string value '10.0' in t4 because the t4 value will be
-        -- converted into an integer.
-        return test:execsql [[
+        return test:catchsql [[
             SELECT * FROM t4 WHERE x IN (SELECT a FROM t3);
         ]]
     end, {
         -- <subquery-2.5.2>
-        "10"
+        1, "Type mismatch: can not convert integer to string"
         -- </subquery-2.5.2>
     })
 
 test:do_test(
     "subquery-2.5.3.1",
     function()
-        -- The t4i index cannot be used to resolve the "x IN (...)" constraint
-        -- because the constraint has integer affinity but t4i has text affinity.
-        return test:execsql [[
+        return test:catchsql [[
             CREATE INDEX t4i ON t4(x);
             SELECT * FROM t4 WHERE x IN (SELECT a FROM t3);
         ]]
     end, {
         -- <subquery-2.5.3.1>
-        "10"
+        1, "Type mismatch: can not convert integer to string"
         -- </subquery-2.5.3.1>
     })
 
diff --git a/test/sql-tap/tkt-80e031a00f.test.lua b/test/sql-tap/tkt-80e031a00f.test.lua
index a0e6539..2e0a921 100755
--- a/test/sql-tap/tkt-80e031a00f.test.lua
+++ b/test/sql-tap/tkt-80e031a00f.test.lua
@@ -346,7 +346,7 @@ test:do_catchsql_test(
         SELECT 'hello' IN t1
     ]], {
         -- <tkt-80e031a00f.27>
-        1, 'Type mismatch: can not convert hello to integer'
+        1, 'Type mismatch: can not convert integer to string'
         -- </tkt-80e031a00f.27>
     })
 
@@ -356,7 +356,7 @@ test:do_catchsql_test(
         SELECT 'hello' NOT IN t1
     ]], {
         -- <tkt-80e031a00f.28>
-        1, 'Type mismatch: can not convert hello to integer'
+        1, 'Type mismatch: can not convert integer to string'
         -- </tkt-80e031a00f.28>
     })
 
@@ -386,7 +386,7 @@ test:do_catchsql_test(
         SELECT x'303132' IN t1
     ]], {
         -- <tkt-80e031a00f.31>
-        1, 'Type mismatch: can not convert varbinary to integer'
+        1, 'Type mismatch: can not convert integer to varbinary'
         -- </tkt-80e031a00f.31>
     })
 
@@ -396,7 +396,7 @@ test:do_catchsql_test(
         SELECT x'303132' NOT IN t1
     ]], {
         -- <tkt-80e031a00f.32>
-        1, 'Type mismatch: can not convert varbinary to integer'
+        1, 'Type mismatch: can not convert integer to varbinary'
         -- </tkt-80e031a00f.32>
     })
 
diff --git a/test/sql/boolean.result b/test/sql/boolean.result
index b893f2a..112f0ac 100644
--- a/test/sql/boolean.result
+++ b/test/sql/boolean.result
@@ -3859,12 +3859,12 @@ SELECT false IN (0, 1, 2, 3);
 SELECT true IN (SELECT b FROM t7);
  | ---
  | - null
- | - 'Type mismatch: can not convert TRUE to integer'
+ | - 'Type mismatch: can not convert integer to boolean'
  | ...
 SELECT false IN (SELECT b FROM t7);
  | ---
  | - null
- | - 'Type mismatch: can not convert FALSE to integer'
+ | - 'Type mismatch: can not convert integer to boolean'
  | ...
 SELECT a1, a1 IN (0, 1, 2, 3) FROM t6
  | ---
@@ -5002,22 +5002,22 @@ SELECT a2 IN (0.1, 1.2, 2.3, 3.4) FROM t6 LIMIT 1;
 SELECT true IN (SELECT c FROM t8);
  | ---
  | - null
- | - 'Type mismatch: can not convert TRUE to number'
+ | - 'Type mismatch: can not convert number to boolean'
  | ...
 SELECT false IN (SELECT c FROM t8);
  | ---
  | - null
- | - 'Type mismatch: can not convert FALSE to number'
+ | - 'Type mismatch: can not convert number to boolean'
  | ...
 SELECT a1 IN (SELECT c FROM t8) FROM t6 LIMIT 1;
  | ---
  | - null
- | - 'Type mismatch: can not convert FALSE to number'
+ | - 'Type mismatch: can not convert number to boolean'
  | ...
 SELECT a2 IN (SELECT c FROM t8) FROM t6 LIMIT 1;
  | ---
  | - null
- | - 'Type mismatch: can not convert TRUE to number'
+ | - 'Type mismatch: can not convert number to boolean'
  | ...
 
 SELECT true BETWEEN 0.1 and 9.9;


More information about the Tarantool-patches mailing list