[tarantool-patches] Re: [PATCH 3/3] sql: fix passing FP values to integer iterator
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Tue May 28 00:49:14 MSK 2019
Thanks for the patch! See 7 comments below.
1. Double whitespace in the commit title before 'FP'.
2. The patch does not work with unsigned numbers.
tarantool> format = {}
---
...
tarantool> format[1] = {name = 'ID', type = 'unsigned'}
---
...
tarantool> format[2] = {name = 'A', type = 'unsigned'}
---
...
tarantool> s = box.schema.create_space('T1', {format = format})
---
...
tarantool> pk = s:create_index('pk')
---
...
tarantool> sk = s:create_index('sk', {parts = {'A'}})
---
...
tarantool> box.execute("INSERT INTO t1 VALUES (1, 1);")
---
- row_count: 1
...
tarantool> box.execute("SELECT a FROM t1 WHERE a IN (1.1, 2.1);")
2019-05-28 00:34:19.922 [1452] main/102/interactive key_def.h:488 E> ER_KEY_PART_TYPE: Supplied key type of part 0 does not match index part type: expected unsigned
---
- error: 'Failed to execute SQL statement: Supplied key type of part 0 does not match
index part type: expected unsigned'
...
On 24/05/2019 20:39, Nikita Pettik wrote:
> Before this patch it was impossible to compare indexed field of integer
> type and floating point value. For instance:
>
> CREATE TABLE t1(id INT PRIMARY KEY, a INT UNIQUE);
> INSERT INTO t1 VALUES (1, 1);
> SELECT * FROM t1 WHERE a = 1.5;
> ---
> - error: 'Failed to execute SQL statement: Supplied key type of part 0 does not match
> index part type: expected integer'
> ...
>
> That happened due to the fact that type casting mechanism (OP_ApplyType)
> doesn't affect FP value when it is converted to integer. Hence, FP value
> was passed to the iterator over integer field which resulted in error.
> Meanwhile, comparison of integer and FP values is legal in SQL. To cope
> with this problem for each equality comparison involving integer field
> we emit OP_MustBeInt, which checks whether value to be compared is
> integer or not. If the latter, we assume that result of comparison is
> always false and continue processing query. For inequality constraints
> we pass auxiliary flag to OP_Seek** opcodes to notify it that one of key
> fields must be truncated to integer (in case of FP value) alongside with
> changing iterator's type: a > 1.5 -> a >= 2.
>
> Closes #4187
> ---
> src/box/sql/vdbe.c | 23 +++++++++------
> src/box/sql/wherecode.c | 76 +++++++++++++++++++++++++++++++++++--------------
> test/sql/types.result | 40 ++++++++++++++++++++++++++
> test/sql/types.test.lua | 9 ++++++
> 4 files changed, 118 insertions(+), 30 deletions(-)
>
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index 3bd82234e..3ead6b40f 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -3496,15 +3495,22 @@ case OP_SeekGT: { /* jump, in3 */
> pC->seekOp = pOp->opcode;
> #endif
> iKey = 0;
> - reg_ipk = pOp->p5;
> -
> - if (reg_ipk > 0) {
> + /*
> + * In case floating value is intended to be passed to
> + * iterator over integer field, we must truncate it to
> + * integer value and change type of iterator:
> + * a > 1.5 -> a >= 2
> + */
> + int int_field = pOp->p5;
3. Comments on P5 are now outdated.
>
> + if (int_field > 0) {
> /* The input value in P3 might be of any type: integer, real, string,
> * blob, or NULL. But it needs to be an integer before we can do
> * the seek, so convert it.
> */
> - pIn3 = &aMem[reg_ipk];
> + pIn3 = &aMem[int_field];
> + if ((pIn3->flags & MEM_Null) != 0)
> + goto skip_truncate;
> if ((pIn3->flags & (MEM_Int|MEM_Real|MEM_Str))==MEM_Str) {
> mem_apply_numeric_type(pIn3);
> }
> diff --git a/src/box/sql/wherecode.c b/src/box/sql/wherecode.c
> index 977c0fced..e779729c7 100644
> --- a/src/box/sql/wherecode.c
> +++ b/src/box/sql/wherecode.c
> @@ -1086,31 +1086,61 @@ sqlWhereCodeOneLoopStart(WhereInfo * pWInfo, /* Complete information about the W
> start_constraints = 1;
> }
> struct index_def *idx_pk = space->index[0]->def;
> - int fieldno = idx_pk->key_def->parts[0].fieldno;
> -
> uint32_t pk_part_count = idx_pk->key_def->part_count;
> - if (pk_part_count == 1 &&
> - space->def->fields[fieldno].type == FIELD_TYPE_INTEGER) {
> - /* Right now INTEGER PRIMARY KEY is the only option to
> - * get Tarantool's INTEGER column type. Need special handling
> - * here: try to loosely convert FLOAT to INT. If RHS type
> - * is not INT or FLOAT - skip this ites, i.e. goto addrNxt.
> - */
> - int limit = pRangeStart == NULL ? nEq : nEq + 1;
> - for (int i = 0; i < limit; i++) {
> - if (idx_def->key_def->parts[i].fieldno ==
> - idx_pk->key_def->parts[0].fieldno) {
> - /* Here: we know for sure that table has INTEGER
> - PRIMARY KEY, single column, and Index we're
> - trying to use for scan contains this column. */
> - if (i < nEq)
> - sqlVdbeAddOp2(v, OP_MustBeInt, regBase + i, addrNxt);
> - else
> - force_integer_reg = regBase + i;
> - break;
> - }
> + /*
> + * Tarantool's iterator over integer fields doesn't
> + * tolerate floating point values. Hence, if term
> + * is equality comparison and value of operand is
> + * not integer, we can skip it since it always
> + * results in false: INT a == 0.5 -> false;
> + * It is done using OP_MustBeInt facilities.
> + * In case term is greater comparison (a > ?), we
> + * should notify OP_SeekGT to process truncation of
> + * floating point value: a > 0.5 -> a >= 1;
> + * It is done by setting P5 flag for OP_Seek*.
> + * It is worth mentioning that we do not need
> + * this step when it comes for less comparison:
> + * key is NULL in this case: values are ordered
> + * as NULL, ... NULL, min_value, ..., so to
> + * fetch min value we pass NULL to GT iterator.
4. I did not get the 'less comparator == compare with NULL'
idea. Firstly you describe how to compare 'a > ?', this is
fully understandable. Then you, as I see, go to the 'a < ?'
case, but somewhy say, that it is the same as 'NULL < ?'.
Do not understand.
> + * The only exception is less comparison in
> + * conjunction with ORDER BY DESC clause:
> + * in such situation we use LE iterator and
> + * truncated value to compare. But then
> + * pRangeStart == NULL.
> + * This procedure is correct for compound index:
> + * only one comparison of less/greater type can be
> + * used at the same time. For instance,
> + * a < 1.5 AND b > 0.5 is handled by SeekGT using
> + * column a and fetching column b from tuple and
> + * OP_Le comparison.
> + *
> + * Note that OP_ApplyType, which is emitted before
> + * OP_Seek** doesn't truncate floating point to
> + * integer. That's why we need this routine.
5. Why can't OP_ApplyType truncate the numbers? Because it does
not know, to which side round the values? Then probably we can
drop OP_ApplyType invocations?
> + * Also, note that terms are separated by OR
> + * predicates, so we consider term as sequence
> + * of AND'ed predicates.
> + */
> + int seek_addr = 0;
> + for (int i = 0; i < nEq; i++) {
> + enum field_type type = idx_def->key_def->parts[i].type;
> + if (type == FIELD_TYPE_INTEGER) {
> + /*
> + * OP_MustBeInt consider NULLs as
> + * non-integer values, so firstly
> + * check whether value is NULL or not.
> + */
> + seek_addr = sqlVdbeAddOp1(v, OP_IsNull,
> + regBase);
6. I do not understand, how does it work. You rewrite
seek_addr in the cycle. It means, that if there are 2 integers,
the first OP_IsNull address is rewritten by the second, and
below, in sqlVdbeJumpHere(v, seek_addr), you will update P2
for the last comparison only. The first OP_IsNull.P2 will stay 0.
> + sqlVdbeAddOp2(v, OP_MustBeInt, regBase + i,
> + addrNxt);
> }
> }
> + /* Inequality constraint comes always at the end of list. */
> + if (pRangeStart != NULL &&
> + idx_def->key_def->parts[nEq].type == FIELD_TYPE_INTEGER)
> + force_integer_reg = regBase + nEq;
> emit_apply_type(pParse, regBase, nConstraint - bSeekPastNull,
> start_types);
> if (pLoop->nSkip > 0 && nConstraint == pLoop->nSkip) {
> @@ -1122,6 +1152,8 @@ sqlWhereCodeOneLoopStart(WhereInfo * pWInfo, /* Complete information about the W
> op = aStartOp[(start_constraints << 2) +
> (startEq << 1) + bRev];
> assert(op != 0);
> + if (seek_addr != 0)
> + sqlVdbeJumpHere(v, seek_addr);
> sqlVdbeAddOp4Int(v, op, iIdxCur, addrNxt, regBase,
> nConstraint);
> /* If this is Seek* opcode, and IPK is detected in the
> diff --git a/test/sql/types.result b/test/sql/types.result
> index 758018eb5..274d73dd2 100644
> --- a/test/sql/types.result
> +++ b/test/sql/types.result
> @@ -927,3 +927,43 @@ box.execute("SELECT * FROM tboolean WHERE s1 = 1.123;")
> box.space.TBOOLEAN:drop()
> ---
> ...
> +box.execute("CREATE TABLE t1(id INT PRIMARY KEY, a INT UNIQUE);")
> +---
> +- row_count: 1
> +...
> +box.execute("INSERT INTO t1 VALUES (1, 1);")
> +---
> +- row_count: 1
> +...
> +box.execute("SELECT a FROM t1 WHERE a IN (1.1, 2.1);")
> +---
> +- metadata:
> + - name: A
> + type: integer
> + rows: []
> +...
> +box.execute("SELECT a FROM t1 WHERE a = 1.1;")
> +---
> +- metadata:
> + - name: A
> + type: integer
> + rows: []
> +...
> +box.execute("SELECT a FROM t1 WHERE a > 1.1;")
> +---
> +- metadata:
> + - name: A
> + type: integer
> + rows: []
> +...
> +box.execute("SELECT a FROM t1 WHERE a < 1.1;")
> +---
> +- metadata:
> + - name: A
> + type: integer
> + rows:
> + - [1]
7. Please, add the Kostja's example about '= 1.0'.
> +...
> +box.space.T1:drop()
> +---
> +...
More information about the Tarantool-patches
mailing list