From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 0C0202BE98 for ; Mon, 27 May 2019 17:49:18 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OwoqwmtRqgUs for ; Mon, 27 May 2019 17:49:17 -0400 (EDT) Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 364C82BE85 for ; Mon, 27 May 2019 17:49:17 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH 3/3] sql: fix passing FP values to integer iterator References: <789091b7acd99c908d26689f27c55f8b6dba3d16.1558700151.git.korablev@tarantool.org> From: Vladislav Shpilevoy Message-ID: Date: Tue, 28 May 2019 00:49:14 +0300 MIME-Version: 1.0 In-Reply-To: <789091b7acd99c908d26689f27c55f8b6dba3d16.1558700151.git.korablev@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org, Nikita Pettik Thanks for the patch! See 7 comments below. 1. Double whitespace in the commit title before 'FP'. 2. The patch does not work with unsigned numbers. tarantool> format = {} --- ... tarantool> format[1] = {name = 'ID', type = 'unsigned'} --- ... tarantool> format[2] = {name = 'A', type = 'unsigned'} --- ... tarantool> s = box.schema.create_space('T1', {format = format}) --- ... tarantool> pk = s:create_index('pk') --- ... tarantool> sk = s:create_index('sk', {parts = {'A'}}) --- ... tarantool> box.execute("INSERT INTO t1 VALUES (1, 1);") --- - row_count: 1 ... tarantool> box.execute("SELECT a FROM t1 WHERE a IN (1.1, 2.1);") 2019-05-28 00:34:19.922 [1452] main/102/interactive key_def.h:488 E> ER_KEY_PART_TYPE: Supplied key type of part 0 does not match index part type: expected unsigned --- - error: 'Failed to execute SQL statement: Supplied key type of part 0 does not match index part type: expected unsigned' ... On 24/05/2019 20:39, Nikita Pettik wrote: > Before this patch it was impossible to compare indexed field of integer > type and floating point value. For instance: > > CREATE TABLE t1(id INT PRIMARY KEY, a INT UNIQUE); > INSERT INTO t1 VALUES (1, 1); > SELECT * FROM t1 WHERE a = 1.5; > --- > - error: 'Failed to execute SQL statement: Supplied key type of part 0 does not match > index part type: expected integer' > ... > > That happened due to the fact that type casting mechanism (OP_ApplyType) > doesn't affect FP value when it is converted to integer. Hence, FP value > was passed to the iterator over integer field which resulted in error. > Meanwhile, comparison of integer and FP values is legal in SQL. To cope > with this problem for each equality comparison involving integer field > we emit OP_MustBeInt, which checks whether value to be compared is > integer or not. If the latter, we assume that result of comparison is > always false and continue processing query. For inequality constraints > we pass auxiliary flag to OP_Seek** opcodes to notify it that one of key > fields must be truncated to integer (in case of FP value) alongside with > changing iterator's type: a > 1.5 -> a >= 2. > > Closes #4187 > --- > src/box/sql/vdbe.c | 23 +++++++++------ > src/box/sql/wherecode.c | 76 +++++++++++++++++++++++++++++++++++-------------- > test/sql/types.result | 40 ++++++++++++++++++++++++++ > test/sql/types.test.lua | 9 ++++++ > 4 files changed, 118 insertions(+), 30 deletions(-) > > diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c > index 3bd82234e..3ead6b40f 100644 > --- a/src/box/sql/vdbe.c > +++ b/src/box/sql/vdbe.c > @@ -3496,15 +3495,22 @@ case OP_SeekGT: { /* jump, in3 */ > pC->seekOp = pOp->opcode; > #endif > iKey = 0; > - reg_ipk = pOp->p5; > - > - if (reg_ipk > 0) { > + /* > + * In case floating value is intended to be passed to > + * iterator over integer field, we must truncate it to > + * integer value and change type of iterator: > + * a > 1.5 -> a >= 2 > + */ > + int int_field = pOp->p5; 3. Comments on P5 are now outdated. > > + if (int_field > 0) { > /* The input value in P3 might be of any type: integer, real, string, > * blob, or NULL. But it needs to be an integer before we can do > * the seek, so convert it. > */ > - pIn3 = &aMem[reg_ipk]; > + pIn3 = &aMem[int_field]; > + if ((pIn3->flags & MEM_Null) != 0) > + goto skip_truncate; > if ((pIn3->flags & (MEM_Int|MEM_Real|MEM_Str))==MEM_Str) { > mem_apply_numeric_type(pIn3); > } > diff --git a/src/box/sql/wherecode.c b/src/box/sql/wherecode.c > index 977c0fced..e779729c7 100644 > --- a/src/box/sql/wherecode.c > +++ b/src/box/sql/wherecode.c > @@ -1086,31 +1086,61 @@ sqlWhereCodeOneLoopStart(WhereInfo * pWInfo, /* Complete information about the W > start_constraints = 1; > } > struct index_def *idx_pk = space->index[0]->def; > - int fieldno = idx_pk->key_def->parts[0].fieldno; > - > uint32_t pk_part_count = idx_pk->key_def->part_count; > - if (pk_part_count == 1 && > - space->def->fields[fieldno].type == FIELD_TYPE_INTEGER) { > - /* Right now INTEGER PRIMARY KEY is the only option to > - * get Tarantool's INTEGER column type. Need special handling > - * here: try to loosely convert FLOAT to INT. If RHS type > - * is not INT or FLOAT - skip this ites, i.e. goto addrNxt. > - */ > - int limit = pRangeStart == NULL ? nEq : nEq + 1; > - for (int i = 0; i < limit; i++) { > - if (idx_def->key_def->parts[i].fieldno == > - idx_pk->key_def->parts[0].fieldno) { > - /* Here: we know for sure that table has INTEGER > - PRIMARY KEY, single column, and Index we're > - trying to use for scan contains this column. */ > - if (i < nEq) > - sqlVdbeAddOp2(v, OP_MustBeInt, regBase + i, addrNxt); > - else > - force_integer_reg = regBase + i; > - break; > - } > + /* > + * Tarantool's iterator over integer fields doesn't > + * tolerate floating point values. Hence, if term > + * is equality comparison and value of operand is > + * not integer, we can skip it since it always > + * results in false: INT a == 0.5 -> false; > + * It is done using OP_MustBeInt facilities. > + * In case term is greater comparison (a > ?), we > + * should notify OP_SeekGT to process truncation of > + * floating point value: a > 0.5 -> a >= 1; > + * It is done by setting P5 flag for OP_Seek*. > + * It is worth mentioning that we do not need > + * this step when it comes for less comparison: > + * key is NULL in this case: values are ordered > + * as NULL, ... NULL, min_value, ..., so to > + * fetch min value we pass NULL to GT iterator. 4. I did not get the 'less comparator == compare with NULL' idea. Firstly you describe how to compare 'a > ?', this is fully understandable. Then you, as I see, go to the 'a < ?' case, but somewhy say, that it is the same as 'NULL < ?'. Do not understand. > + * The only exception is less comparison in > + * conjunction with ORDER BY DESC clause: > + * in such situation we use LE iterator and > + * truncated value to compare. But then > + * pRangeStart == NULL. > + * This procedure is correct for compound index: > + * only one comparison of less/greater type can be > + * used at the same time. For instance, > + * a < 1.5 AND b > 0.5 is handled by SeekGT using > + * column a and fetching column b from tuple and > + * OP_Le comparison. > + * > + * Note that OP_ApplyType, which is emitted before > + * OP_Seek** doesn't truncate floating point to > + * integer. That's why we need this routine. 5. Why can't OP_ApplyType truncate the numbers? Because it does not know, to which side round the values? Then probably we can drop OP_ApplyType invocations? > + * Also, note that terms are separated by OR > + * predicates, so we consider term as sequence > + * of AND'ed predicates. > + */ > + int seek_addr = 0; > + for (int i = 0; i < nEq; i++) { > + enum field_type type = idx_def->key_def->parts[i].type; > + if (type == FIELD_TYPE_INTEGER) { > + /* > + * OP_MustBeInt consider NULLs as > + * non-integer values, so firstly > + * check whether value is NULL or not. > + */ > + seek_addr = sqlVdbeAddOp1(v, OP_IsNull, > + regBase); 6. I do not understand, how does it work. You rewrite seek_addr in the cycle. It means, that if there are 2 integers, the first OP_IsNull address is rewritten by the second, and below, in sqlVdbeJumpHere(v, seek_addr), you will update P2 for the last comparison only. The first OP_IsNull.P2 will stay 0. > + sqlVdbeAddOp2(v, OP_MustBeInt, regBase + i, > + addrNxt); > } > } > + /* Inequality constraint comes always at the end of list. */ > + if (pRangeStart != NULL && > + idx_def->key_def->parts[nEq].type == FIELD_TYPE_INTEGER) > + force_integer_reg = regBase + nEq; > emit_apply_type(pParse, regBase, nConstraint - bSeekPastNull, > start_types); > if (pLoop->nSkip > 0 && nConstraint == pLoop->nSkip) { > @@ -1122,6 +1152,8 @@ sqlWhereCodeOneLoopStart(WhereInfo * pWInfo, /* Complete information about the W > op = aStartOp[(start_constraints << 2) + > (startEq << 1) + bRev]; > assert(op != 0); > + if (seek_addr != 0) > + sqlVdbeJumpHere(v, seek_addr); > sqlVdbeAddOp4Int(v, op, iIdxCur, addrNxt, regBase, > nConstraint); > /* If this is Seek* opcode, and IPK is detected in the > diff --git a/test/sql/types.result b/test/sql/types.result > index 758018eb5..274d73dd2 100644 > --- a/test/sql/types.result > +++ b/test/sql/types.result > @@ -927,3 +927,43 @@ box.execute("SELECT * FROM tboolean WHERE s1 = 1.123;") > box.space.TBOOLEAN:drop() > --- > ... > +box.execute("CREATE TABLE t1(id INT PRIMARY KEY, a INT UNIQUE);") > +--- > +- row_count: 1 > +... > +box.execute("INSERT INTO t1 VALUES (1, 1);") > +--- > +- row_count: 1 > +... > +box.execute("SELECT a FROM t1 WHERE a IN (1.1, 2.1);") > +--- > +- metadata: > + - name: A > + type: integer > + rows: [] > +... > +box.execute("SELECT a FROM t1 WHERE a = 1.1;") > +--- > +- metadata: > + - name: A > + type: integer > + rows: [] > +... > +box.execute("SELECT a FROM t1 WHERE a > 1.1;") > +--- > +- metadata: > + - name: A > + type: integer > + rows: [] > +... > +box.execute("SELECT a FROM t1 WHERE a < 1.1;") > +--- > +- metadata: > + - name: A > + type: integer > + rows: > + - [1] 7. Please, add the Kostja's example about '= 1.0'. > +... > +box.space.T1:drop() > +--- > +...