From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 819BC29F82 for ; Tue, 23 Apr 2019 15:59:05 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 4xlrw_XQvvkM for ; Tue, 23 Apr 2019 15:59:05 -0400 (EDT) Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 1AF82290EA for ; Tue, 23 Apr 2019 15:59:05 -0400 (EDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.8\)) Subject: [tarantool-patches] Re: [PATCH 9/9] sql: make accept only boolean From: "n.pettik" In-Reply-To: Date: Tue, 23 Apr 2019 22:59:02 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: <09F94053-CCE0-49FD-8788-3F0749BF7ED8@tarantool.org> References: <103cef3d31c59d1b869a7675a01ed2e6279a47ef.1555252410.git.korablev@tarantool.org> <6DC565F1-23AA-49CA-BB2C-AA60EE9E3593@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org Cc: Vladislav Shpilevoy > On 18/04/2019 20:55, n.pettik wrote: >>=20 >>>> is a predicate used as a part of WHERE and >>>> JOIN clauses. ANSI SQL states that must >>>> accept only boolean arguments. In our SQL it is implemented as >>>> bytecode instruction OP_If which in turn carries out logic of >>>> conditional jump. Since it can be involved in executing other = routines >>>> different from ,=20 >>>=20 >>> 1. Which other routines? What is a valid case of OP_If with = non-boolean >>> value in check? >>=20 >> For instance, to verify that register containing LIMIT value is > 0. >=20 > Yes, and this is almost the only case. What is more, it happens only = once > per request, to check if LIMIT =3D=3D 0 initially. Further it is = decremented > and checked via OP_IfNotZero and OP_DecrJumpZero. >=20 >> It is quite hard to track values which come to this opcode, so we >> can=E2=80=99t be sure that it always accepts booleans. >=20 > It is hard, but without it >=20 > 1) You can't be sure, that really all the search conditions > are checked to be booleans; >=20 > 2) It makes OP_If/IfNot slower, and they are called repeatedly in > requests; One branching worth nothing. Below you suggest to split opcode into two (fix me if I=E2=80=99m wrong), which in turn affects = performance way much more. > 3) It adds one more flag SQL_BOOLREQ, which looks very crutchy. IMHO it is matter of taste. Anyway, removed this flag. > I tried myself to disable non-bool for OP_If/IfNot, and I fixed > LIMIT cases. Then I've found CASE-WHEN also uses OP_If/IfNot in > this distorted way. >=20 > tarantool> box.execute("SELECT CASE WHEN 0 THEN 'zero' WHEN 1 THEN = 'one' END;=E2=80=9D) Fixed. > It violates the standard. "Information technology =E2=80=94 > Database languages =E2=80=94 SQL =E2=80=94 Part 2: Foundation = (SQL/Foundation)", > 2011, page 230. >=20 > 'WHEN' is a search condition, but I've used '1', not 'true'. > Also I tested it on PostgreSQL - they raise an error, so it is > both standard and practically used way. >=20 > Below are my fixes for LIMIT and a small obvious refactoring, > but they are *not on the branch* - not all the tests pass when I > start banning non-bools in OP_If/IfNot. I=E2=80=99ve fixed that. > And I do not like these temporary registers. They seem OK to me. > Probably we could somehow reuse OP_IfNotZero, > for example, via having it refactored to OP_CheckZero which would > be able to jump both if the target is zero and if not. For example, > it could look like this: >=20 > /** > * P1 - a register with value to check if it equals zero. > * P3 - if 1, then check r[P1] =3D=3D 0, if 0, then check r[P1] !=3D = 0. > * If the check is successful, then jump to P2. > *=20 > * if (r[P1] =3D=3D 0) =3D=3D P3 then jump p2 > */ >=20 > But I do not know what to do with its current version decrementing > r[P1]. My ideas were about splitting 'decrement' operation from > zero-checking. Then we could do these replacements: >=20 > OP_IfNotZero -> OP_CheckZero, OP_Decrement >=20 > OP_DecrJumpZero -> OP_Decrement, OP_CheckZero >=20 > OP_IfNot -> OP_CheckZero (where it is used inappropriately) >=20 > And now we drop OP_IfNotZero and OP_DecrJumpZero, and forbid > non-bool in OP_IfNot/OP_If. Perhaps we will find more standard > violations. >=20 > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D I=E2=80=99ve managed to make OP_If accept only booleans without = splitting opcodes. Diff (including your changes): diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c index 3c4cde2c8..86fc28606 100644 --- a/src/box/sql/expr.c +++ b/src/box/sql/expr.c @@ -1094,24 +1094,16 @@ sqlPExprAddSelect(Parse * pParse, Expr * pExpr, = Select * pSelect) * LEFT JOIN, then we cannot determine at compile-time whether or not * is it true or false, so always return 0. */ -static int +static inline bool exprAlwaysTrue(Expr * p) { - if (ExprHasProperty(p, EP_FromJoin)) - return 0; - if (p->op =3D=3D TK_TRUE) - return 1; - return 0; + return !ExprHasProperty(p, EP_FromJoin) && p->op =3D=3D TK_TRUE; } =20 -static int +static inline bool exprAlwaysFalse(Expr * p) { - if (ExprHasProperty(p, EP_FromJoin)) - return 0; - if (p->op =3D=3D TK_FALSE) - return 1; - return 0; + return !ExprHasProperty(p, EP_FromJoin) && p->op =3D=3D = TK_FALSE; } =20 struct Expr * diff --git a/src/box/sql/select.c b/src/box/sql/select.c index 428e3a92f..280391040 100644 --- a/src/box/sql/select.c +++ b/src/box/sql/select.c @@ -769,8 +769,10 @@ pushOntoSorter(Parse * pParse, /* = Parser context */ pParse->nMem +=3D pSort->nOBSat; nKey =3D nExpr - pSort->nOBSat + bSeq; if (bSeq) { - addrFirst =3D - sqlVdbeAddOp1(v, OP_IfNot, regBase + nExpr); + int r1 =3D sqlGetTempReg(pParse); + sqlVdbeAddOp2(v, OP_Integer, 0, r1); + addrFirst =3D sqlVdbeAddOp3(v, OP_Eq, r1, 0, = regBase + nExpr); + sqlReleaseTempReg(pParse, r1); } else { addrFirst =3D sqlVdbeAddOp1(v, OP_SequenceTest, @@ -799,9 +801,10 @@ pushOntoSorter(Parse * pParse, /* = Parser context */ pSort->labelBkOut); sqlVdbeAddOp1(v, OP_ResetSorter, pSort->iECursor); if (iLimit) { - sqlVdbeAddOp2(v, OP_IfNot, iLimit, - pSort->labelDone); - VdbeCoverage(v); + int r1 =3D sqlGetTempReg(pParse); + sqlVdbeAddOp2(v, OP_Integer, 0, r1); + sqlVdbeAddOp3(v, OP_Eq, r1, pSort->labelDone, = iLimit); + sqlReleaseTempReg(pParse, r1); } sqlVdbeJumpHere(v, addrFirst); sqlExprCodeMove(pParse, regBase, regPrevKey, = pSort->nOBSat); @@ -2115,11 +2118,10 @@ computeLimitRegisters(Parse * pParse, Select * = p, int iBreak) P4_STATIC); =20 sqlVdbeResolveLabel(v, positive_limit_label); - sqlReleaseTempReg(pParse, r1); VdbeCoverage(v); VdbeComment((v, "LIMIT counter")); - sqlVdbeAddOp2(v, OP_IfNot, iLimit, iBreak); - VdbeCoverage(v); + sqlVdbeAddOp3(v, OP_Eq, r1, iBreak, iLimit); + sqlReleaseTempReg(pParse, r1); =20 if ((p->selFlags & SF_SingleRow) !=3D 0) { if (ExprHasProperty(p->pLimit, EP_System)) { @@ -2673,10 +2675,11 @@ multiSelect(Parse * pParse, /* Parsing = context */ p->iLimit =3D pPrior->iLimit; p->iOffset =3D pPrior->iOffset; if (p->iLimit) { - addr =3D - sqlVdbeAddOp1(v, OP_IfNot, - = p->iLimit); - VdbeCoverage(v); + int r1 =3D = sqlGetTempReg(pParse); + sqlVdbeAddOp2(v, OP_Integer, 0, = r1); + addr =3D sqlVdbeAddOp3(v, OP_Eq, = r1, 0, + p->iLimit); + sqlReleaseTempReg(pParse, r1); VdbeComment((v, "Jump ahead if = LIMIT reached")); if (p->iOffset) { @@ -3047,7 +3050,7 @@ generateOutputSubroutine(struct Parse *parse, = struct Select *p, sqlVdbeJumpHere(v, addr1); sqlVdbeAddOp3(v, OP_Copy, in->iSdst, reg_prev + 1, in->nSdst - 1); - sqlVdbeAddOp2(v, OP_Integer, 1, reg_prev); + sqlVdbeAddOp2(v, OP_Bool, true, reg_prev); } if (parse->db->mallocFailed) return 0; @@ -3370,7 +3373,7 @@ multiSelectOrderBy(Parse * pParse, /* = Parsing context */ assert(nOrderBy >=3D expr_count || db->mallocFailed); regPrev =3D pParse->nMem + 1; pParse->nMem +=3D expr_count + 1; - sqlVdbeAddOp2(v, OP_Integer, 0, regPrev); + sqlVdbeAddOp2(v, OP_Bool, 0, regPrev); key_info_dup =3D sql_key_info_new(db, expr_count); if (key_info_dup !=3D NULL) { for (int i =3D 0; i < expr_count; i++) { diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c index 51d30ee1b..b22598b67 100644 --- a/src/box/sql/vdbe.c +++ b/src/box/sql/vdbe.c @@ -1721,8 +1721,8 @@ integer_overflow: * functions. * * If P1 is not zero, then it is a register that a subsequent min() or - * max() aggregate will set to 1 if the current row is not the minimum = or - * maximum. The P1 register is initialized to 0 by this instruction. + * max() aggregate will set to true if the current row is not the = minimum or + * maximum. The P1 register is initialized to false by this = instruction. * * The interface used by the implementation of the aforementioned = functions * to retrieve the collation sequence set by this opcode is not = available @@ -1731,7 +1731,7 @@ integer_overflow: case OP_CollSeq: { assert(pOp->p4type=3D=3DP4_COLLSEQ || pOp->p4.pColl =3D=3D = NULL); if (pOp->p1) { - sqlVdbeMemSetInt64(&aMem[pOp->p1], 0); + mem_set_bool(&aMem[pOp->p1], false); } break; } @@ -2563,21 +2563,10 @@ case OP_IfNot: { /* jump, in1 */ } else if ((pIn1->flags & MEM_Bool) !=3D 0) { c =3D pOp->opcode=3D=3DOP_IfNot ? ! pIn1->u.b : = pIn1->u.b; } else { - if (pOp->p5 =3D=3D OPFLAG_BOOLREQ) { - diag_set(ClientError, ER_SQL_TYPE_MISMATCH, - sql_value_text(pIn1), "boolean"); - rc =3D SQL_TARANTOOL_ERROR; - goto abort_due_to_error; - } - double v; - if (sqlVdbeRealValue(pIn1, &v) !=3D 0) { - diag_set(ClientError, ER_SQL_TYPE_MISMATCH, - sql_value_text(pIn1), "real"); - rc =3D SQL_TARANTOOL_ERROR; - goto abort_due_to_error; - } - c =3D v !=3D 0; - if (pOp->opcode=3D=3DOP_IfNot) c =3D !c; + diag_set(ClientError, ER_SQL_TYPE_MISMATCH, + sql_value_text(pIn1), "boolean"); + rc =3D SQL_TARANTOOL_ERROR; + goto abort_due_to_error; } VdbeBranchTaken(c!=3D0, 2); if (c) { @@ -5301,7 +5290,7 @@ case OP_AggStep: { if (pCtx->skipFlag) { assert(pOp[-1].opcode=3D=3DOP_CollSeq); i =3D pOp[-1].p1; - if (i) sqlVdbeMemSetInt64(&aMem[i], 1); + if (i) mem_set_bool(&aMem[i], true); } break; } diff --git a/test/sql-tap/cse.test.lua b/test/sql-tap/cse.test.lua index 39c1cc4ca..78d0d2046 100755 --- a/test/sql-tap/cse.test.lua +++ b/test/sql-tap/cse.test.lua @@ -102,7 +102,7 @@ test:do_execsql_test( test:do_execsql_test( "cse-1.6.3", [[ - SELECT CASE WHEN b THEN d WHEN e THEN f ELSE 999 END, b, c, d = FROM t1 + SELECT CASE WHEN CAST(b AS BOOLEAN) THEN d WHEN CAST(e AS = BOOLEAN) THEN f ELSE 999 END, b, c, d FROM t1 ]], { -- 13, 11, 12, 13, 23, 21, 22, 23 @@ -112,7 +112,7 @@ test:do_execsql_test( test:do_execsql_test( "cse-1.6.4", [[ - SELECT b, c, d, CASE WHEN b THEN d WHEN e THEN f ELSE 999 END = FROM t1 + SELECT b, c, d, CASE WHEN CAST(b AS BOOLEAN) THEN d WHEN CAST(e = AS BOOLEAN) THEN f ELSE 999 END FROM t1 ]], { -- 11, 12, 13, 13, 21, 22, 23, 23 @@ -122,7 +122,7 @@ test:do_execsql_test( test:do_execsql_test( "cse-1.6.5", [[ - SELECT b, c, d, CASE WHEN 0 THEN d WHEN e THEN f ELSE 999 END = FROM t1 + SELECT b, c, d, CASE WHEN false THEN d WHEN CAST(e AS BOOLEAN) = THEN f ELSE 999 END FROM t1 ]], { -- 11, 12, 13, 15, 21, 22, 23, 25 diff --git a/test/sql-tap/e_select1.test.lua = b/test/sql-tap/e_select1.test.lua index 9e9c4c551..c4724e636 100755 --- a/test/sql-tap/e_select1.test.lua +++ b/test/sql-tap/e_select1.test.lua @@ -703,7 +703,7 @@ test:do_execsql_test( test:do_execsql_test( "e_select-3.2.1b", [[ - SELECT k FROM x1 LEFT JOIN x2 USING(k) WHERE x2.k + SELECT k FROM x1 LEFT JOIN x2 USING(k) WHERE x2.k <> 0 ]], { -- 1, 3, 5 diff --git a/test/sql-tap/orderby1.test.lua = b/test/sql-tap/orderby1.test.lua index ea03c494b..51e8d301f 100755 --- a/test/sql-tap/orderby1.test.lua +++ b/test/sql-tap/orderby1.test.lua @@ -735,7 +735,7 @@ test:do_execsql_test( SELECT ( SELECT 'hardware' FROM (=20 SELECT 'software' ORDER BY 'firmware' ASC, 'sportswear' = DESC - ) GROUP BY 1 HAVING length(b) + ) GROUP BY 1 HAVING length(b) <> 0 ) FROM abc; ]], { diff --git a/test/sql-tap/select1.test.lua = b/test/sql-tap/select1.test.lua index 1bad7679f..6beeb34cb 100755 --- a/test/sql-tap/select1.test.lua +++ b/test/sql-tap/select1.test.lua @@ -1492,13 +1492,13 @@ test:do_catchsql_test( -- }) =20 -test:do_execsql_test( +test:do_catchsql_test( "select1-8.1", [[ SELECT f1 FROM test1 WHERE 4.3+2.4 OR 1 ORDER BY f1 ]], { -- - 11, 33 + 1, 'Type mismatch: can not convert 6.7 to boolean' -- }) =20 diff --git a/test/sql-tap/select2.test.lua = b/test/sql-tap/select2.test.lua index 7f0188d6f..e08c8b3b6 100755 --- a/test/sql-tap/select2.test.lua +++ b/test/sql-tap/select2.test.lua @@ -222,7 +222,7 @@ test:do_execsql_test( test:do_execsql_test( "select2-4.3", [[ - SELECT * FROM aa CROSS JOIN bb WHERE NOT b; + SELECT * FROM aa CROSS JOIN bb WHERE NOT b <> 0; ]], { -- 1, 0, 3, 0 @@ -242,7 +242,7 @@ test:do_execsql_test( test:do_execsql_test( "select2-4.5", [[ - SELECT * FROM aa, bb WHERE NOT min(a,b); + SELECT * FROM aa, bb WHERE NOT min(a,b) <> 0; ]], {