[tarantool-patches] Re: [PATCH 3/6] sql: pass true types of columns to Tarantool

Fri Nov 2 05:36:47 MSK 2018

> On 30 Oct 2018, at 00:32, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> wrote:
> 
> Hi! Thanks for the fixes! See 2 comments below.
> 
>>>> +		struct index_def *pk = pUseIndex;
>>>>  		assert(pk != NULL);
>>>>  -		uint32_t fieldno = pk->def->key_def->parts[0].fieldno;
>>>> +		uint32_t fieldno = pk->key_def->parts[0].fieldno;
>>>>  		enum affinity_type affinity =
>>>>  			tab->def->fields[fieldno].affinity;
>>>> -		if (pk->def->key_def->part_count == 1 &&
>>>> +		if (pk->key_def->part_count == 1 &&
>>>>  		    affinity == AFFINITY_INTEGER && (int)fieldno < nVector) {
>>>>  			int reg_pk = rLhs + (int)fieldno;
>>> 
>>> 3. Why do we still need this MustBeInt here? As I understand, here we
>>> are trying to use an index for binary search of an LHS, but under the
>>> hood it is just tarantool iterators starting from certain keys, it is
>>> not? If it is, then tarantool iterators does key validation for us. Can
>>> you please check if this opcode is redundant?
>> It doesn’t seem to be useless. Consider example:
>> CREATE TABLE t1 (id INT PRIMARY KEY);
>> SELECT 1.23 IN t1;
>> Now, OP_Affinity doesn’t cast REAL to INT - I described reason
>> in previous review. So, without OP_MustBeInt query above would
>> result in
>> Execution failed: Supplied key type of part 0 does not match index part type: expected integer
>> Since 1.23 is passed to iterator over integer field. But I guess query above
>> is fine - it should simply return 0, not an error since queries like
>> SELECT * FROM t1 WHERE id = 1.123;
>> SELECT * FROM t1 WHERE id > 1.123;
>> INSERT INTO t1 VALUES (1.123);
>> are allowed as well.
> 
> 1. Then I am confused. Look at OP_MustBeInt. It calls
> applyAffinity(pIn1, AFFINITY_NUMERIC). NUMERIC is real, so
> 1.23 cast to NUMERIC should be unchanged, it is not? I
> tried this:
> 
> tarantool> box.sql.execute("select CAST(1.23 as NUMERIC)")
> ---
> - - [1.23]
> ...
> 
> So 1.23 is already numeric. But applyAffinity function
> for unknown reason does this: for target affinity ==
> NUMERIC it sees that the original affinity is real and
> casts the result to integer. So
> applyAffinity(1.23, NUMERIC) == 1, but
> SELECT CAST 1.23 as NUMERIC == 1.23.   ?!

Sorry for making you confused, but this conversion takes
place in the next patch: there it is not applyAffinity(1.23, NUMERIC),
but mem_apply_affinity(1.23, INTEGER). So it seems to work.

>>>> diff --git a/test/sql-tap/boundary3.test.lua b/test/sql-tap/boundary3.test.lua
>>>> index 5b63e0539..c5d605b65 100755
>>>> --- a/test/sql-tap/boundary3.test.lua
>>>> +++ b/test/sql-tap/boundary3.test.lua
>>>> @@ -125,7 +125,7 @@ test:do_test(
>>>>          -- </boundary3-1.3>
>>>>      })
>>>>  -test:do_execsql_test(
>>>> +--[[test:do_execsql_test(
>>> 
>>> 4. Do not silently comment tests. We should either delete it, or fix, or
>>> open an issue to fix it later.
>> These tests are working. In previous review you asked to enable tests which
>> are under ‘if false’, so I forgot about simply commented tests.
> 
> 2. Sorry, in3.test.lua still has a commented test. orderby5.test.lua too.

Overall, I’ve scrolled over patch again and tried to eliminate all stray diffs
and accidentally commented tests. Diff is really huge, I hope I didn’t miss
smth else again.

I’ve deleted commented test in orderby5.test.lua since it doesn’t really
make any sense. Yep, query plan has changed (a lot of queries have
changed their plans after types were specified for tables). But it is still
the same if I create table with types on current 2.1.

diff --git a/test/sql-tap/orderby5.test.lua b/test/sql-tap/orderby5.test.lua
index 8a423843a..048f024ab 100755
--- a/test/sql-tap/orderby5.test.lua
+++ b/test/sql-tap/orderby5.test.lua
@@ -179,20 +179,6 @@ test:do_execsql_test(
 --   EXPLAIN QUERY PLAN
 --   SELECT * FROM t1 WHERE a=0 ORDER BY c, b, a;
 -- } {/B-TREE/}
---test:do_execsql_test(
---    3.0,
---    [[
---        CREATE TABLE t3(a INTEGER PRIMARY KEY, b INT, c INT, d INT, e INT, f INT);
-        --CREATE INDEX t3bcde ON t3(b, c, d, e);
-        -- As pk is not necessary in Tarantool's secondary indexes 'a' should be added manually
---        CREATE INDEX t3bcde ON t3(b, c, d, e, a);
---        EXPLAIN QUERY PLAN
---        SELECT a FROM t3 WHERE b=2 AND c=3 ORDER BY d DESC, e DESC, b, c, a DESC;
---    ]], {
-        -- <3.0>
---        "~/B-TREE/"
-        -- </3.0>
---    })
 
For in3.test.lua I’ve uncommented the rest of tests. However, I’ve
found a bug connected with IN operator: if query in form of

SELECT * FROM t1 WHERE X IN (SELECT Y FROM t2);

and PK of Y is of type INT, then additional OP_MustBeINT opcode
is generated. But instead of PK, it is applied to column Y, which might
not be forced to INT. Hence, I slightly refactored code for IN operator
and removed generation of OP_MustBeINT at all. Instead, I allowed
using index search only if types match exactly (what I think is the only
right way). Also, I’ve added explanation comment for each changed
test in in3.test.lua

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 270a6664c..3dd7fb01a 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -2404,23 +2404,12 @@ sqlite3FindInIndex(Parse * pParse,      /* Parsing context */
                        char idxaff =
                                sqlite3TableColumnAffinity(pTab->def, iCol);
                        enum affinity_type lhs_aff = sqlite3ExprAffinity(pLhs);
-                       char cmpaff = sql_affinity_result(lhs_aff, idxaff);
-                       testcase(cmpaff == AFFINITY_BLOB);
-                       testcase(cmpaff == AFFINITY_TEXT);
-                       switch (cmpaff) {
-                       case AFFINITY_BLOB:
-                               break;
-                       case AFFINITY_TEXT:
-                               /* sql_affinity_result() only returns TEXT if one side or the
-                                * other has no affinity and the other side is TEXT.  Hence,
-                                * the only way for cmpaff to be TEXT is for idxaff to be TEXT
-                                * and for the term on the LHS of the IN to have no affinity.
-                                */
-                               assert(idxaff == AFFINITY_TEXT);
-                               break;
-                       default:
-                               affinity_ok = sqlite3IsNumericAffinity(idxaff);
-                       }
+                       /*
+                        * Index search is possible only if types
+                        * of columns match.
+                        */
+                       if (lhs_aff != idxaff)
+                               affinity_ok = 0;
                }
 
                if (affinity_ok) {
@@ -3148,25 +3137,6 @@ sqlite3ExprCodeIN(Parse * pParse,        /* Parsing and code generating context */
         * of the RHS using the LHS as a probe.  If found, the result is
         * true.
         */
+       sqlite3VdbeAddOp4(v, OP_Affinity, rLhs, nVector, 0, zAff, nVector);
-       if ((pExpr->flags & EP_xIsSelect)
-           && !pExpr->is_ephemeral) {
-               struct SrcList *src_list = pExpr->x.pSelect->pSrc;
-               assert(src_list->nSrc == 1);
-
-               struct Table *tab = src_list->a[0].pTab;
-               assert(tab != NULL);
-               struct index *pk = space_index(tab->space, 0);
-               assert(pk != NULL);
-
-               uint32_t fieldno = pk->def->key_def->parts[0].fieldno;
-               enum affinity_type affinity =
-                       tab->def->fields[fieldno].affinity;
-               if (pk->def->key_def->part_count == 1 &&
-                   affinity == AFFINITY_INTEGER && (int)fieldno < nVector) {
-                       int reg_pk = rLhs + (int)fieldno;
-                       sqlite3VdbeAddOp2(v, OP_MustBeInt, reg_pk, destIfFalse);
-               }
-       }

diff --git a/test/sql-tap/in3.test.lua b/test/sql-tap/in3.test.lua
index d38907c37..6d751dc52 100755
--- a/test/sql-tap/in3.test.lua
+++ b/test/sql-tap/in3.test.lua
@@ -1,6 +1,6 @@
 #!/usr/bin/env tarantool
 test = require("sqltester")
-test:plan(22)
+test:plan(27)
 
 --!./tcltestrunner.lua
 -- 2007 November 29
@@ -92,7 +92,7 @@ test:do_test(
         return exec_neph(" SELECT a FROM t1 WHERE a+0 IN (SELECT a FROM t1); ")
     end, {
         -- <in3-1.5>
-        0, 1, 3, 5
+        1, 1, 3, 5
         -- </in3-1.5>
     })
 
@@ -293,39 +293,39 @@ test:do_test(
 test:do_test(
     "in3-3.2",
     function()
-        -- No affinity is applied before comparing "x" and "a". Therefore
-        -- the index can be used (the comparison is false, text!=number).
+        -- Both columns have type 'BLOB' so they are comparable.
+        -- Moreover, we can use index and avoid materializing
+        -- retults into ephemeral table.
         return exec_neph(" SELECT x IN (SELECT a FROM t1) FROM t2 ")
     end, {
         -- <in3-3.2>
-        0, 0
+        0, 1
         -- </in3-3.2>
     })
 
-if false then
 test:do_test(
     "in3-3.3",
     function()
-        -- Logically, numeric affinity is applied to both sides before 
-        -- the comparison.  Therefore it is possible to use index t1_i2.
+        -- the comparison, but index can't be used.
         return exec_neph(" SELECT x IN (SELECT b FROM t1) FROM t2 ")
     end, {
         -- <in3-3.3>
-        0, 1
+        1, 1
         -- </in3-3.3>
     })
 
---[[test:do_test(
+test:do_test(
     "in3-3.4",
     function()
-        -- No affinity is applied before the comparison takes place. Making
-        -- it possible to use index t1_i3.
+        -- BLOB is compatible with TEXT, however index can't
+        -- be used since under the hood BLOB is SCALAR (which
+        -- can contain not only STRING values) and TEXT is STRING.
         return exec_neph(" SELECT x IN (SELECT c FROM t1) FROM t2 ")
     end, {
         -- <in3-3.4>
-        0, 1
+        1, 1
         -- </in3-3.4>
-    })]]
+    })
 
 test:do_test(
     "in3-3.5",
@@ -362,7 +362,6 @@ test:do_test(
         1, 1
         -- </in3-3.7>
     })
-end
 -----------------------------------------------------------------------
 --
 -- Test using a multi-column index.
@@ -395,7 +394,7 @@ test:do_test(
         return exec_neph(" SELECT 'text' IN (SELECT b FROM t3)")
     end, {
         -- <in3-4.2>
-        0, 0
+        0, 1
         -- </in3-4.2>
     })