[tarantool-patches] [PATCH 0/2] Improve operability of typeof() function

Tarantool development patches archive
 help / color / mirror / Atom feed

* [tarantool-patches] [PATCH 0/2] Improve operability of typeof() function
@ 2019-07-27 18:45 Nikita Pettik
  2019-07-27 18:45 ` [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member Nikita Pettik
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Nikita Pettik @ 2019-07-27 18:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy, Nikita Pettik

Branch: https://github.com/tarantool/tarantool/compare/np/gh-4148-add-field-type-to-mem
Issue: https://github.com/tarantool/tarantool/issues/4148

First patch adds field_type member to struct Mem. It allows to
improve type calculation for NULL values: when value is fetched
from tuple, field_type is assigned to the type of corresponding
field in space format. So that NULL from field with INTEGER type
now has INTEGER type, from REAL - REAL etc. The only exception is
SCALAR type since it's not basic type but rather aggregation of
types. Thus, NULL values from SCALAR field features default type
(in the next patch it is changed to BOOLEAN).

Second patch switches default type of NULL literal to boolean as it
was decided to do during numerous discussions in mailing list. See
commit message for details.

Nikita Pettik (2):
  sql: extend struct Mem with field_type member
  sql: make default type of NULL be boolean

 src/box/sql/func.c          | 14 +++++++++++++-
 src/box/sql/vdbe.c          | 36 +++++++++++++++++++++++++++++++++++-
 src/box/sql/vdbeInt.h       | 11 +++++++++++
 src/box/sql/vdbeapi.c       |  1 +
 src/box/sql/vdbeaux.c       |  1 +
 src/box/sql/vdbemem.c       |  3 +++
 src/box/sql/vdbesort.c      |  6 ++++++
 test/sql-tap/cast.test.lua  | 12 ++++++------
 test/sql-tap/func.test.lua  | 12 ++++++------
 test/sql-tap/table.test.lua |  4 ++--
 test/sql/types.result       | 35 ++++++++++++++++++++++++++++++++++-
 test/sql/types.test.lua     | 11 +++++++++++
 12 files changed, 129 insertions(+), 17 deletions(-)

-- 
2.15.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-07-27 18:45 [tarantool-patches] [PATCH 0/2] Improve operability of typeof() function Nikita Pettik
@ 2019-07-27 18:45 ` Nikita Pettik
  2019-07-28 15:22   ` [tarantool-patches] " Vladislav Shpilevoy
  2019-07-27 18:45 ` [tarantool-patches] [PATCH 2/2] sql: make default type of NULL be boolean Nikita Pettik
  2019-08-02 10:52 ` [tarantool-patches] Re: [PATCH 0/2] Improve operability of typeof() function Kirill Yukhin
  2 siblings, 1 reply; 11+ messages in thread
From: Nikita Pettik @ 2019-07-27 18:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy, Nikita Pettik

There's several reasons to do so. First one is to improve operability of
built-in function 'typeof()' which returns string containing argument's
type. Using only format (msgpack) type we can't determine real field
type of null values. Moreover, result of CAST should feature the same
type as it was specified in operation. For instance:
typeof(CAST(0 AS INTEGER)) should return "integer", not "unsigned".

The second reason is different rules for comparison involving values of
SCALAR type. In Tarantool NoSQL it is allowed to compare values of
"incompatible" types like booleans and strings (using heuristics which
says that values of one type always are greater/less that values of
another type), in case they are stored in SCALAR field type. Without
storing actual field type in struct Mem it is obvious impossible to
achieve.

Thus, now field_type member in struct Mem represents supposed field type
in case value is fetched from tuple or is a result of CAST operation.

Closes #4148
---
 src/box/sql/func.c          | 11 +++++++++++
 src/box/sql/vdbe.c          | 36 +++++++++++++++++++++++++++++++++++-
 src/box/sql/vdbeInt.h       | 11 +++++++++++
 src/box/sql/vdbeapi.c       |  1 +
 src/box/sql/vdbeaux.c       |  1 +
 src/box/sql/vdbemem.c       |  3 +++
 src/box/sql/vdbesort.c      |  6 ++++++
 test/sql-tap/cast.test.lua  |  8 ++++----
 test/sql-tap/table.test.lua |  4 ++--
 test/sql/types.result       | 35 ++++++++++++++++++++++++++++++++++-
 test/sql/types.test.lua     | 11 +++++++++++
 11 files changed, 119 insertions(+), 8 deletions(-)

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index fcf147c96..f50df105d 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
 {
 	const char *z = 0;
 	UNUSED_PARAMETER(NotUsed);
+	enum field_type f_t = argv[0]->field_type;
+	/*
+	 * SCALAR is not a basic type, but rather an aggregation of
+	 * types. Thus, ignore SCALAR field type and return msgpack
+	 * format type.
+	 */
+	if (f_t != field_type_MAX && f_t != FIELD_TYPE_SCALAR) {
+		sql_result_text(context, field_type_strs[argv[0]->field_type],
+				-1, SQL_STATIC);
+		return;
+	}
 	switch (sql_value_type(argv[0])) {
 	case MP_INT:
 	case MP_UINT:
diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index 9142932e9..2c7a1c2da 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -1238,6 +1238,7 @@ case OP_SoftNull: {
 	assert(pOp->p1>0 && pOp->p1<=(p->nMem+1 - p->nCursor));
 	pOut = &aMem[pOp->p1];
 	pOut->flags = (pOut->flags|MEM_Null)&~MEM_Undefined;
+	pOut->field_type = field_type_MAX;
 	break;
 }
 
@@ -1278,6 +1279,11 @@ case OP_Variable: {            /* out2 */
 	}
 	pOut = out2Prerelease(p, pOp);
 	sqlVdbeMemShallowCopy(pOut, pVar, MEM_Static);
+	/*
+	 * ...ShallowCopy() may overwrite field_type with irrelevant
+	 * value. So make sure that in the end field_type_MAX is set.
+	 */
+	pOut->field_type = field_type_MAX;
 	UPDATE_MAX_BLOBSIZE(pOut);
 	break;
 }
@@ -1338,6 +1344,7 @@ case OP_Copy: {
 	pIn1 = &aMem[pOp->p1];
 	pOut = &aMem[pOp->p2];
 	assert(pOut!=pIn1);
+
 	while( 1) {
 		sqlVdbeMemShallowCopy(pOut, pIn1, MEM_Ephem);
 		Deephemeralize(pOut);
@@ -1389,6 +1396,7 @@ case OP_IntCopy: {            /* out2 */
 	assert((pIn1->flags & (MEM_Int | MEM_UInt)) != 0);
 	pOut = &aMem[pOp->p2];
 	mem_set_int(pOut, pIn1->u.i, (pIn1->flags & MEM_Int) != 0);
+	pOut->field_type = field_type_MAX;
 	break;
 }
 
@@ -1491,6 +1499,8 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
 	assert(pIn1!=pOut);
 	if ((pIn1->flags | pIn2->flags) & MEM_Null) {
 		sqlVdbeMemSetNull(pOut);
+		/* Force NULL be of type STRING. */
+		pOut->field_type = FIELD_TYPE_STRING;
 		break;
 	}
 	/*
@@ -1535,6 +1545,7 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
 	pOut->z[nByte+1] = 0;
 	pOut->flags |= MEM_Term;
 	pOut->n = (int)nByte;
+	pOut->field_type = field_type_MAX;
 	UPDATE_MAX_BLOBSIZE(pOut);
 	break;
 }
@@ -1644,6 +1655,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 		}
 		}
 		mem_set_int(pOut, iB, is_res_neg);
+		pOut->field_type = field_type_MAX;
 	} else {
 		bIntint = 0;
 		if (sqlVdbeRealValue(pIn1, &rA) != 0) {
@@ -1681,6 +1693,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 		}
 		pOut->u.r = rB;
 		MemSetTypeFlag(pOut, MEM_Real);
+		pOut->field_type = field_type_MAX;
 		if (((type1|type2)&MEM_Real)==0 && !bIntint) {
 			mem_apply_integer_type(pOut);
 		}
@@ -1689,6 +1702,8 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 
 arithmetic_result_is_null:
 	sqlVdbeMemSetNull(pOut);
+	/* Force NULL be of type NUMBER. */
+	pOut->field_type = FIELD_TYPE_NUMBER;
 	break;
 
 division_by_zero:
@@ -1909,6 +1924,7 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
 		}
 	}
 	mem_set_i64(pOut, iA);
+	pOut->field_type = field_type_MAX;
 	break;
 }
 
@@ -1988,6 +2004,13 @@ case OP_Cast: {                  /* in1 */
 	if (ExpandBlob(pIn1) != 0)
 		goto abort_due_to_error;
 	rc = sqlVdbeMemCast(pIn1, pOp->p2);
+	/*
+	 * SCALAR is not type itself, but rather an aggregation
+	 * of types. Hence, cast to this type shouldn't change
+	 * original type of argument.
+	 */
+	if (pOp->p2 != FIELD_TYPE_SCALAR)
+		pIn1->field_type = pOp->p2;
 	UPDATE_MAX_BLOBSIZE(pIn1);
 	if (rc == 0)
 		break;
@@ -2094,6 +2117,8 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
 	pIn3 = &aMem[pOp->p3];
 	flags1 = pIn1->flags;
 	flags3 = pIn3->flags;
+	enum field_type ft_p1 = pIn1->field_type;
+	enum field_type ft_p3 = pIn3->field_type;
 	if ((flags1 | flags3)&MEM_Null) {
 		/* One or both operands are NULL */
 		if (pOp->p5 & SQL_NULLEQ) {
@@ -2232,8 +2257,10 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
 	/* Undo any changes made by mem_apply_type() to the input registers. */
 	assert((pIn1->flags & MEM_Dyn) == (flags1 & MEM_Dyn));
 	pIn1->flags = flags1;
+	pIn1->field_type = ft_p1;
 	assert((pIn3->flags & MEM_Dyn) == (flags3 & MEM_Dyn));
 	pIn3->flags = flags3;
+	pIn3->field_type = ft_p3;
 
 	if (pOp->p5 & SQL_STOREP2) {
 		pOut = &aMem[pOp->p2];
@@ -2451,6 +2478,7 @@ case OP_Or: {             /* same as TK_OR, in1, in2, out3 */
 	} else {
 		mem_set_bool(pOut, v1);
 	}
+	pOut->field_type = field_type_MAX;
 	break;
 }
 
@@ -2472,6 +2500,7 @@ case OP_Not: {                /* same as TK_NOT, in1, out2 */
 			goto abort_due_to_error;
 		}
 		mem_set_bool(pOut, ! pIn1->u.b);
+		pOut->field_type = field_type_MAX;
 	}
 	break;
 }
@@ -2645,7 +2674,7 @@ case OP_Column: {
 		}
 		pC->cacheStatus = p->cacheCtr;
 	}
-	enum field_type field_type = FIELD_TYPE_ANY;
+	enum field_type field_type = field_type_MAX;
 	if (pC->eCurType == CURTYPE_TARANTOOL) {
 		/*
 		 * Ephemeral spaces feature only one index
@@ -2660,6 +2689,8 @@ case OP_Column: {
 			field_type = pC->uc.pCursor->space->def->
 					fields[p2].type;
 		}
+	} else if (pC->eCurType == CURTYPE_SORTER) {
+		field_type = vdbe_sorter_get_field_type(pC->uc.pSorter, p2);
 	}
 	struct Mem *default_val_mem =
 		pOp->p4type == P4_MEM ? pOp->p4.pMem : NULL;
@@ -2679,6 +2710,7 @@ case OP_Column: {
 				sqlVdbeMemSetDouble(pDest, pDest->u.u);
 		}
 	}
+	pDest->field_type = field_type;
 op_column_out:
 	REGISTER_TRACE(p, pOp->p3, pDest);
 	break;
@@ -3750,6 +3782,7 @@ case OP_FCopy: {     /* out2 */
 	if ((pOp->p3 & OPFLAG_NOOP_IF_NULL) && (pIn1->flags & MEM_Null)) {
 		pOut = &aMem[pOp->p2];
 		if (pOut->flags & MEM_Undefined) pOut->flags = MEM_Null;
+		pOut->field_type = field_type_MAX;
 		/* Flag is set and register is NULL -> do nothing  */
 	} else {
 		assert(memIsValid(pIn1));
@@ -3757,6 +3790,7 @@ case OP_FCopy: {     /* out2 */
 
 		pOut = &aMem[pOp->p2];
 		mem_set_int(pOut, pIn1->u.i, pIn1->flags == MEM_Int);
+		pOut->field_type = pIn1->field_type;
 	}
 	break;
 }
diff --git a/src/box/sql/vdbeInt.h b/src/box/sql/vdbeInt.h
index b83926a66..e98e01c8e 100644
--- a/src/box/sql/vdbeInt.h
+++ b/src/box/sql/vdbeInt.h
@@ -174,6 +174,13 @@ struct Mem {
 	u32 flags;		/* Some combination of MEM_Null, MEM_Str, MEM_Dyn, etc. */
 	/** Subtype for this value. */
 	enum sql_subtype subtype;
+	/**
+	 * If value is fetched from tuple, then this property
+	 * contains type of corresponding space's field. If it's
+	 * value field_type_MAX then we can rely on on format
+	 * (msgpack) type which is represented by 'flags'.
+	 */
+	enum field_type field_type;
 	int n;			/* size (in bytes) of string value, excluding trailing '\0' */
 	char *z;		/* String or BLOB value */
 	/* ShallowCopy only needs to copy the information above */
@@ -518,6 +525,10 @@ int sqlVdbeFrameRestore(VdbeFrame *);
 
 int sqlVdbeSorterInit(struct sql *db, struct VdbeCursor *cursor);
 void sqlVdbeSorterReset(sql *, VdbeSorter *);
+
+enum field_type
+vdbe_sorter_get_field_type(struct VdbeSorter *sorter, uint32_t field_no);
+
 void sqlVdbeSorterClose(sql *, VdbeCursor *);
 int sqlVdbeSorterRowkey(const VdbeCursor *, Mem *);
 int sqlVdbeSorterNext(sql *, const VdbeCursor *, int *);
diff --git a/src/box/sql/vdbeapi.c b/src/box/sql/vdbeapi.c
index ecf1b3601..c62cb7a1e 100644
--- a/src/box/sql/vdbeapi.c
+++ b/src/box/sql/vdbeapi.c
@@ -620,6 +620,7 @@ columnNullValue(void)
 		0},
 		    /* .flags      = */ (u16) MEM_Null,
 		    /* .eSubtype   = */ (u8) 0,
+		    /* .field_type = */ field_type_MAX,
 		    /* .n          = */ (int)0,
 		    /* .z          = */ (char *)0,
 		    /* .zMalloc    = */ (char *)0,
diff --git a/src/box/sql/vdbeaux.c b/src/box/sql/vdbeaux.c
index e153c150f..e9adf7bfa 100644
--- a/src/box/sql/vdbeaux.c
+++ b/src/box/sql/vdbeaux.c
@@ -1280,6 +1280,7 @@ initMemArray(Mem * p, int N, sql * db, u32 flags)
 		p->db = db;
 		p->flags = flags;
 		p->szMalloc = 0;
+		p->field_type = field_type_MAX;
 #ifdef SQL_DEBUG
 		p->pScopyFrom = 0;
 #endif
diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
index 847a6b0ce..e5941de8e 100644
--- a/src/box/sql/vdbemem.c
+++ b/src/box/sql/vdbemem.c
@@ -327,6 +327,7 @@ sqlVdbeMemFinalize(Mem * pMem, FuncDef * pFunc)
 		memset(&t, 0, sizeof(t));
 		t.flags = MEM_Null;
 		t.db = pMem->db;
+		t.field_type = field_type_MAX;
 		ctx.pOut = &t;
 		ctx.pMem = pMem;
 		ctx.pFunc = pFunc;
@@ -747,6 +748,7 @@ sqlVdbeMemInit(Mem * pMem, sql * db, u32 flags)
 	pMem->flags = flags;
 	pMem->db = db;
 	pMem->szMalloc = 0;
+	pMem->field_type = field_type_MAX;
 }
 
 /*
@@ -892,6 +894,7 @@ sqlVdbeMemAboutToChange(Vdbe * pVdbe, Mem * pMem)
 	for (i = 0, pX = pVdbe->aMem; i < pVdbe->nMem; i++, pX++) {
 		if (pX->pScopyFrom == pMem) {
 			pX->flags |= MEM_Undefined;
+			pX->field_type = field_type_MAX;
 			pX->pScopyFrom = 0;
 		}
 	}
diff --git a/src/box/sql/vdbesort.c b/src/box/sql/vdbesort.c
index fc89ee721..a2d681255 100644
--- a/src/box/sql/vdbesort.c
+++ b/src/box/sql/vdbesort.c
@@ -968,6 +968,12 @@ sqlVdbeSorterReset(sql * db, VdbeSorter * pSorter)
 	pSorter->pUnpacked = 0;
 }
 
+enum field_type
+vdbe_sorter_get_field_type(struct VdbeSorter *sorter, uint32_t field_no)
+{
+	return sorter->key_def->parts[field_no].type;
+}
+
 /*
  * Free any cursor components allocated by sqlVdbeSorterXXX routines.
  */
diff --git a/test/sql-tap/cast.test.lua b/test/sql-tap/cast.test.lua
index 9fc4ff11f..f06cc81d6 100755
--- a/test/sql-tap/cast.test.lua
+++ b/test/sql-tap/cast.test.lua
@@ -140,7 +140,7 @@ test:do_execsql_test(
         SELECT typeof(CAST(NULL AS text))
     ]], {
         -- <cast-1.14>
-        "null"
+        "string"
         -- </cast-1.14>
     })
 
@@ -160,7 +160,7 @@ test:do_execsql_test(
         SELECT typeof(CAST(NULL AS FLOAT))
     ]], {
         -- <cast-1.16>
-        "null"
+        "number"
         -- </cast-1.16>
     })
 
@@ -200,7 +200,7 @@ test:do_execsql_test(
         SELECT typeof(CAST(NULL AS integer))
     ]], {
         -- <cast-1.20>
-        "null"
+        "integer"
         -- </cast-1.20>
     })
 
@@ -511,7 +511,7 @@ test:do_execsql_test(
         SELECT typeof(CAST(null AS REAL))
     ]], {
         -- <case-1.61>
-        "null"
+        "number"
         -- </case-1.61>
     })
 
diff --git a/test/sql-tap/table.test.lua b/test/sql-tap/table.test.lua
index a539f5222..7f4b15e72 100755
--- a/test/sql-tap/table.test.lua
+++ b/test/sql-tap/table.test.lua
@@ -906,7 +906,7 @@ test:do_execsql_test(
         FROM t7 LIMIT 1;
     ]], {
         -- <table-11.1>
-        "integer", "null", "null", "null", "null", "null", "null", "null"
+        "integer", "number", "string", "string", "null", "null", "string", "string"
         -- </table-11.1>
     })
 
@@ -917,7 +917,7 @@ test:do_execsql_test(
         FROM t7 LIMIT 1;
     ]], {
         -- <table-11.2>
-        "null", "null", "null", "null"
+        "number", "string", "number", "string"
         -- </table-11.2>
     })
 
diff --git a/test/sql/types.result b/test/sql/types.result
index 4589b2d58..6e7fffa42 100644
--- a/test/sql/types.result
+++ b/test/sql/types.result
@@ -636,7 +636,7 @@ box.execute("SELECT typeof(b) FROM t;")
   rows:
   - ['boolean']
   - ['boolean']
-  - ['null']
+  - ['boolean']
 ...
 box.execute("SELECT quote(b) FROM t;")
 ---
@@ -1751,3 +1751,36 @@ box.execute("SELECT CAST('-123' AS UNSIGNED);")
 box.space.T1:drop()
 ---
 ...
+-- gh-4148: make sure that typeof() returns origin type of column
+-- even if value is null.
+--
+box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
+---
+- row_count: 1
+...
+box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
+---
+- row_count: 2
+...
+box.execute("SELECT typeof(a), typeof(s) FROM t;")
+---
+- metadata:
+  - name: typeof(a)
+    type: string
+  - name: typeof(s)
+    type: string
+  rows:
+  - ['integer', 'integer']
+  - ['integer', 'null']
+...
+box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
+---
+- metadata:
+  - name: typeof(CAST(0 AS UNSIGNED))
+    type: string
+  rows:
+  - ['unsigned']
+...
+box.space.T:drop()
+---
+...
diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua
index f07a90b37..489a5a91c 100644
--- a/test/sql/types.test.lua
+++ b/test/sql/types.test.lua
@@ -399,3 +399,14 @@ box.execute("SELECT CAST('123' AS UNSIGNED);")
 box.execute("SELECT CAST('-123' AS UNSIGNED);")
 
 box.space.T1:drop()
+
+-- gh-4148: make sure that typeof() returns origin type of column
+-- even if value is null.
+--
+box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
+box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
+box.execute("SELECT typeof(a), typeof(s) FROM t;")
+
+box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
+
+box.space.T:drop()
-- 
2.15.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-07-27 18:45 ` [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member Nikita Pettik
@ 2019-07-28 15:22   ` Vladislav Shpilevoy
  2019-07-28 21:26     ` Konstantin Osipov
  2019-07-31  9:14     ` n.pettik
  0 siblings, 2 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-07-28 15:22 UTC (permalink / raw)
  To: tarantool-patches, Nikita Pettik

Hi! Thanks for the patch! See 9 comments below.

On 27/07/2019 20:45, Nikita Pettik wrote:
> There's several reasons to do so. First one is to improve operability of
> built-in function 'typeof()' which returns string containing argument's
> type. Using only format (msgpack) type we can't determine real field
> type of null values. Moreover, result of CAST should feature the same
> type as it was specified in operation. For instance:
> typeof(CAST(0 AS INTEGER)) should return "integer", not "unsigned".
> 
> The second reason is different rules for comparison involving values of
> SCALAR type. In Tarantool NoSQL it is allowed to compare values of
> "incompatible" types like booleans and strings (using heuristics which
> says that values of one type always are greater/less that values of
> another type), in case they are stored in SCALAR field type. Without
> storing actual field type in struct Mem it is obvious impossible to
> achieve.
> 
> Thus, now field_type member in struct Mem represents supposed field type
> in case value is fetched from tuple or is a result of CAST operation.
> 
> Closes #4148
> ---
>  src/box/sql/func.c          | 11 +++++++++++
>  src/box/sql/vdbe.c          | 36 +++++++++++++++++++++++++++++++++++-
>  src/box/sql/vdbeInt.h       | 11 +++++++++++
>  src/box/sql/vdbeapi.c       |  1 +
>  src/box/sql/vdbeaux.c       |  1 +
>  src/box/sql/vdbemem.c       |  3 +++
>  src/box/sql/vdbesort.c      |  6 ++++++
>  test/sql-tap/cast.test.lua  |  8 ++++----
>  test/sql-tap/table.test.lua |  4 ++--
>  test/sql/types.result       | 35 ++++++++++++++++++++++++++++++++++-
>  test/sql/types.test.lua     | 11 +++++++++++
>  11 files changed, 119 insertions(+), 8 deletions(-)
> 
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index fcf147c96..f50df105d 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
>  {
>  	const char *z = 0;
>  	UNUSED_PARAMETER(NotUsed);
> +	enum field_type f_t = argv[0]->field_type;

1. Perhaps, field_type is not the best name, because Mem is not always
is a field. It can be just a value, not from a space. It is rather
just 'type'.

> +	/*
> +	 * SCALAR is not a basic type, but rather an aggregation of
> +	 * types. Thus, ignore SCALAR field type and return msgpack
> +	 * format type.
> +	 */
> +	if (f_t != field_type_MAX && f_t != FIELD_TYPE_SCALAR) {

2. Not sure if I understand, why do you need both _MAX and _SCALAR for
values with no strict type. Why not to set SCALAR instead of MAX always?
MAX is not a type, so SCALAR IMO would be more appropriate when you
don't have a strict type. Or even ANY, because maps and arrays will be
supported.

> +		sql_result_text(context, field_type_strs[argv[0]->field_type],
> +				-1, SQL_STATIC);
> +		return;
> +	}
>  	switch (sql_value_type(argv[0])) {
>  	case MP_INT:
>  	case MP_UINT:
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index 9142932e9..2c7a1c2da 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -1238,6 +1238,7 @@ case OP_SoftNull: {
>  	assert(pOp->p1>0 && pOp->p1<=(p->nMem+1 - p->nCursor));
>  	pOut = &aMem[pOp->p1];
>  	pOut->flags = (pOut->flags|MEM_Null)&~MEM_Undefined;
> +	pOut->field_type = field_type_MAX;
>  	break;

3. This opcode is unused and can be deleted.

>  }
>  
> @@ -1278,6 +1279,11 @@ case OP_Variable: {            /* out2 */
>  	}
>  	pOut = out2Prerelease(p, pOp);
>  	sqlVdbeMemShallowCopy(pOut, pVar, MEM_Static);
> +	/*
> +	 * ...ShallowCopy() may overwrite field_type with irrelevant
> +	 * value. So make sure that in the end field_type_MAX is set.
> +	 */

4. The only place where ShallowCopy can rewrite field_type is
memcpy(pTo, pFrom, MEMCELLSIZE). If after that field_type is irrelevant,
then it was incorrect in pFrom, i.e. in pVar. On the summary:
why pVar may have incorrect field_type? Should not bind set a correct
type?

> +	pOut->field_type = field_type_MAX;
>  	UPDATE_MAX_BLOBSIZE(pOut);
>  	break;
>  }
> @@ -1338,6 +1344,7 @@ case OP_Copy: {
>  	pIn1 = &aMem[pOp->p1];
>  	pOut = &aMem[pOp->p2];
>  	assert(pOut!=pIn1);
> +
>  	while( 1) {

5. Unnecessary diff.

>  		sqlVdbeMemShallowCopy(pOut, pIn1, MEM_Ephem);
>  		Deephemeralize(pOut);
> @@ -1389,6 +1396,7 @@ case OP_IntCopy: {            /* out2 */
>  	assert((pIn1->flags & (MEM_Int | MEM_UInt)) != 0);
>  	pOut = &aMem[pOp->p2];
>  	mem_set_int(pOut, pIn1->u.i, (pIn1->flags & MEM_Int) != 0);
> +	pOut->field_type = field_type_MAX;

6. Why mem_set_int does not set a correct field_type? Or why
pOut->field_type can't be assigned to pIn->field_type? You actually
do it for OP_FCopy. The same about OP_Remainder.

When you assign field_type_MAX, and pIn1 was FIELD_TYPE_INTEGER
with an unsigned value, you will have typeof(pOut) unsigned - type
loss.

Additionally, as I see, OP_IntCopy is used only once, when a space
is being created, to copy sequence id. In such a case I think it
can be deleted and replaced with OP_SCopy.

>  	break;
>  }
>  
> @@ -1491,6 +1499,8 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
>  	assert(pIn1!=pOut);
>  	if ((pIn1->flags | pIn2->flags) & MEM_Null) {
>  		sqlVdbeMemSetNull(pOut);
> +		/* Force NULL be of type STRING. */
> +		pOut->field_type = FIELD_TYPE_STRING;
>  		break;
>  	}
>  	/*
> @@ -1535,6 +1545,7 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
>  	pOut->z[nByte+1] = 0;
>  	pOut->flags |= MEM_Term;
>  	pOut->n = (int)nByte;
> +	pOut->field_type = field_type_MAX;
>  	UPDATE_MAX_BLOBSIZE(pOut);
>  	break;
>  }
> @@ -1644,6 +1655,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>  		}
>  		}
>  		mem_set_int(pOut, iB, is_res_neg);
> +		pOut->field_type = field_type_MAX;
>  	} else {
>  		bIntint = 0;
>  		if (sqlVdbeRealValue(pIn1, &rA) != 0) {
> @@ -1681,6 +1693,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>  		}
>  		pOut->u.r = rB;
>  		MemSetTypeFlag(pOut, MEM_Real);
> +		pOut->field_type = field_type_MAX;
>  		if (((type1|type2)&MEM_Real)==0 && !bIntint) {
>  			mem_apply_integer_type(pOut);
>  		}
> @@ -1689,6 +1702,8 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>  
>  arithmetic_result_is_null:
>  	sqlVdbeMemSetNull(pOut);
> +	/* Force NULL be of type NUMBER. */
> +	pOut->field_type = FIELD_TYPE_NUMBER;
>  	break;

7. All these manual assignments of field_type are broken by design.
When you have to update field_type after each operation manually,
you will miss some places for sure. For example:

tarantool> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
---
- row_count: 1
...

tarantool> box.execute('INSERT INTO test VALUES (1, NULL, NULL)')
---
- row_count: 1
...

tarantool> box.execute('SELECT typeof(a & b) FROM test')
---
- metadata:
  - name: typeof(a & b)
    type: string
  rows:
  - ['null']
...

tarantool> box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM test')
---
- metadata:
  - name: typeof(a)
    type: string
  - name: typeof(b)
    type: string
  - name: typeof(a & b)
    type: string
  rows:
  - ['integer', 'integer', 'integer']
...


Here in the last two queries 'typeof(a & b)' result *depends* on
other columns in result set. But it should be integer always, shouldn't
it? The same about all other places, where you call mem_set_* and don't
update field_type.

I propose you to redesign how to update field_type, so as it would
be hard to forget to update it.

For instance, add a field_type argument to sqlVdbeMemSetNull() to never
miss to set a real type for a NULL value. Make mem_set_int update field_type
depending on the positivity flag. Make mem_set_u64 always set field_type
UNSIGNED. Make sqlVdbeMemCast always set field_type to the target type.

>  
>  division_by_zero:
> @@ -1909,6 +1924,7 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>  		}
>  	}
>  	mem_set_i64(pOut, iA);
> +	pOut->field_type = field_type_MAX;
>  	break;
>  }
>  
> @@ -1988,6 +2004,13 @@ case OP_Cast: {                  /* in1 */
>  	if (ExpandBlob(pIn1) != 0)
>  		goto abort_due_to_error;
>  	rc = sqlVdbeMemCast(pIn1, pOp->p2);
> +	/*
> +	 * SCALAR is not type itself, but rather an aggregation
> +	 * of types. Hence, cast to this type shouldn't change
> +	 * original type of argument.
> +	 */
> +	if (pOp->p2 != FIELD_TYPE_SCALAR)
> +		pIn1->field_type = pOp->p2;

8. Why sqlVdbeMemCast does not update the type inside? Now
this function works on half - it updates flags with a new type,
sets a new value, but doesn't touch field_type.

>  	UPDATE_MAX_BLOBSIZE(pIn1);
>  	if (rc == 0)
>  		break;
> @@ -2094,6 +2117,8 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
>  	pIn3 = &aMem[pOp->p3];
>  	flags1 = pIn1->flags;
>  	flags3 = pIn3->flags;
> +	enum field_type ft_p1 = pIn1->field_type;
> +	enum field_type ft_p3 = pIn3->field_type;
>  	if ((flags1 | flags3)&MEM_Null) {
>  		/* One or both operands are NULL */
>  		if (pOp->p5 & SQL_NULLEQ) {
> @@ -2232,8 +2257,10 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
>  	/* Undo any changes made by mem_apply_type() to the input registers. */
>  	assert((pIn1->flags & MEM_Dyn) == (flags1 & MEM_Dyn));
>  	pIn1->flags = flags1;
> +	pIn1->field_type = ft_p1;
>  	assert((pIn3->flags & MEM_Dyn) == (flags3 & MEM_Dyn));
>  	pIn3->flags = flags3;
> +	pIn3->field_type = ft_p3;
>  
>  	if (pOp->p5 & SQL_STOREP2) {
>  		pOut = &aMem[pOp->p2];
> @@ -2451,6 +2478,7 @@ case OP_Or: {             /* same as TK_OR, in1, in2, out3 */
>  	} else {
>  		mem_set_bool(pOut, v1);
>  	}
> +	pOut->field_type = field_type_MAX;
>  	break;
>  }
>  
> @@ -2472,6 +2500,7 @@ case OP_Not: {                /* same as TK_NOT, in1, out2 */
>  			goto abort_due_to_error;
>  		}
>  		mem_set_bool(pOut, ! pIn1->u.b);
> +		pOut->field_type = field_type_MAX;

9. So after mem_set_bool, if the cell was used for
something else before, it is in an invalid state, right?
I don't think it should work so. Please, develop a
better way how to keep field_type up to date.

Ideally, if code doesn't touch type flags, it should not touch
field_type. Otherwise that place is broken, if you need to
do follow-up updates after each function, doing something
with types.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-07-28 15:22   ` [tarantool-patches] " Vladislav Shpilevoy
@ 2019-07-28 21:26     ` Konstantin Osipov
  2019-07-31  9:14     ` n.pettik
  1 sibling, 0 replies; 11+ messages in thread
From: Konstantin Osipov @ 2019-07-28 21:26 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Nikita Pettik

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/07/28 21:49]:
> >  	const char *z = 0;
> >  	UNUSED_PARAMETER(NotUsed);
> > +	enum field_type f_t = argv[0]->field_type;
> 
> 1. Perhaps, field_type is not the best name, because Mem is not always
> is a field. It can be just a value, not from a space. It is rather
> just 'type'.
> 

I think field_type is good because it's clear it's derived from
enum field_type. Mem has more than one type now, so just "type" is
ambiguous. My 0.02

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-07-28 15:22   ` [tarantool-patches] " Vladislav Shpilevoy
  2019-07-28 21:26     ` Konstantin Osipov
@ 2019-07-31  9:14     ` n.pettik
  2019-07-31 16:59       ` Vladislav Shpilevoy
  1 sibling, 1 reply; 11+ messages in thread
From: n.pettik @ 2019-07-31  9:14 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Vladislav Shpilevoy



>> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
>> index fcf147c96..f50df105d 100644
>> --- a/src/box/sql/func.c
>> +++ b/src/box/sql/func.c
>> @@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
>> {
>> 	const char *z = 0;
>> 	UNUSED_PARAMETER(NotUsed);
>> +	enum field_type f_t = argv[0]->field_type;
> 
> 1. Perhaps, field_type is not the best name, because Mem is not always
> is a field. It can be just a value, not from a space. It is rather
> just 'type’.

When memory represents smth different from value fetched from
tuple, it is set to field_type_MAX. So I’d rather not rename this member.

>> +	/*
>> +	 * SCALAR is not a basic type, but rather an aggregation of
>> +	 * types. Thus, ignore SCALAR field type and return msgpack
>> +	 * format type.
>> +	 */
>> +	if (f_t != field_type_MAX && f_t != FIELD_TYPE_SCALAR) {
> 
> 2. Not sure if I understand, why do you need both _MAX and _SCALAR for
> values with no strict type. Why not to set SCALAR instead of MAX always?
> MAX is not a type, so SCALAR IMO would be more appropriate when you
> don't have a strict type. Or even ANY, because maps and arrays will be
> supported.

Because SCALAR and ANY are real field types. AFAIR we decided
to introduce different comparison rules for values from SCALAR field
and values from basic type fields. So I’m not sure that setting SCALAR
type by default is a good idea.

>> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
>> index 9142932e9..2c7a1c2da 100644
>> --- a/src/box/sql/vdbe.c
>> +++ b/src/box/sql/vdbe.c
>> @@ -1238,6 +1238,7 @@ case OP_SoftNull: {
>> 	assert(pOp->p1>0 && pOp->p1<=(p->nMem+1 - p->nCursor));
>> 	pOut = &aMem[pOp->p1];
>> 	pOut->flags = (pOut->flags|MEM_Null)&~MEM_Undefined;
>> +	pOut->field_type = field_type_MAX;
>> 	break;
> 
> 3. This opcode is unused and can be deleted.

Ok, moved this change to a separate patch:

Author: Nikita Pettik <korablev@tarantool.org>
Date:   Mon Jul 29 16:03:38 2019 +0300

    sql: remove OP_SoftNull opcode
    
    It's not used anymore and can be removed to reduce size of codebase.

diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index 9142932e9..12de10f37 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -1226,21 +1226,6 @@ case OP_Null: {           /* out2 */
        break;
 }
 
-/* Opcode: SoftNull P1 * * * *
- * Synopsis: r[P1]=NULL
- *
- * Set register P1 to have the value NULL as seen by the OP_MakeRecord
- * instruction, but do not free any string or blob memory associated with
- * the register, so that if the value was a string or blob that was
- * previously copied using OP_SCopy, the copies will continue to be valid.
- */
-case OP_SoftNull: {
-       assert(pOp->p1>0 && pOp->p1<=(p->nMem+1 - p->nCursor));
-       pOut = &aMem[pOp->p1];
-       pOut->flags = (pOut->flags|MEM_Null)&~MEM_Undefined;
-       break;
-}
-
 /* Opcode: Blob P1 P2 P3 P4 *
  * Synopsis: r[P2]=P4 (len=P1, subtype=P3)
  *

>> @@ -1278,6 +1279,11 @@ case OP_Variable: {            /* out2 */
>> 	}
>> 	pOut = out2Prerelease(p, pOp);
>> 	sqlVdbeMemShallowCopy(pOut, pVar, MEM_Static);
>> +	/*
>> +	 * ...ShallowCopy() may overwrite field_type with irrelevant
>> +	 * value. So make sure that in the end field_type_MAX is set.
>> +	 */
> 
> 4. The only place where ShallowCopy can rewrite field_type is
> memcpy(pTo, pFrom, MEMCELLSIZE). If after that field_type is irrelevant,
> then it was incorrect in pFrom, i.e. in pVar. On the summary:
> why pVar may have incorrect field_type? Should not bind set a correct
> type?

Because it is not set during bind*() routines. Moved field_type_MAX
assignment to vdbeUnbind() routine:

diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index fcc1701eb..20c1e39ec 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -1263,11 +1263,6 @@ case OP_Variable: {            /* out2 */
        }
        pOut = out2Prerelease(p, pOp);
        sqlVdbeMemShallowCopy(pOut, pVar, MEM_Static);
-       /*
-        * ...ShallowCopy() may overwrite field_type with irrelevant
-        * value. So make sure that in the end field_type_MAX is set.
-        */
-       pOut->field_type = field_type_MAX;
        UPDATE_MAX_BLOBSIZE(pOut);
        break;
 }
diff --git a/src/box/sql/vdbeapi.c b/src/box/sql/vdbeapi.c
index c62cb7a1e..d470b6cc8 100644
--- a/src/box/sql/vdbeapi.c
+++ b/src/box/sql/vdbeapi.c
@@ -834,6 +834,7 @@ vdbeUnbind(Vdbe * p, int i)
        pVar = &p->aVar[i];
        sqlVdbeMemRelease(pVar);
        pVar->flags = MEM_Null;
+       pVar->field_type = field_type_MAX;
 
        /* If the bit corresponding to this variable in Vdbe.expmask is set, then
         * binding a new value to this variable invalidates the current query plan.

>> @@ -1338,6 +1344,7 @@ case OP_Copy: {
>> 	pIn1 = &aMem[pOp->p1];
>> 	pOut = &aMem[pOp->p2];
>> 	assert(pOut!=pIn1);
>> +
>> 	while( 1) {
> 
> 5. Unnecessary diff.

Removed.

>> 		sqlVdbeMemShallowCopy(pOut, pIn1, MEM_Ephem);
>> 		Deephemeralize(pOut);
>> @@ -1389,6 +1396,7 @@ case OP_IntCopy: {            /* out2 */
>> 	assert((pIn1->flags & (MEM_Int | MEM_UInt)) != 0);
>> 	pOut = &aMem[pOp->p2];
>> 	mem_set_int(pOut, pIn1->u.i, (pIn1->flags & MEM_Int) != 0);
>> +	pOut->field_type = field_type_MAX;
> 
> 6. Why mem_set_int does not set a correct field_type?

Why should it do so? As its name says, field type refers
to type of filed in space’s format. For value’s type (mp format)
we have “flags" property.

> Or why
> pOut->field_type can't be assigned to pIn->field_type? You actually
> do it for OP_FCopy. The same about OP_Remainder.
> 
> When you assign field_type_MAX, and pIn1 was FIELD_TYPE_INTEGER
> with an unsigned value, you will have typeof(pOut) unsigned - type
> loss.
> 
> Additionally, as I see, OP_IntCopy is used only once, when a space
> is being created, to copy sequence id. In such a case I think it
> can be deleted and replaced with OP_SCopy.

Actually, that’s why I don’t bother with type manipulations.
Replaced OP_IntCopy with OP_SCopy in a separate patch:

Author: Nikita Pettik <korablev@tarantool.org>
Date:   Mon Jul 29 19:08:41 2019 +0300

    sql: remove OP_IntCopy opcode
    
    Its purpose is to copy integer value from one memory cell to another.
    In other words, it is particular case of OP_SCopy instruction. Since it
    is used only during creation of AUTOINCREMENT property, it seems to be
    reasonable to replace it with a bit general OP_SCopy and erase
    OP_IntCopy at all reducing size of SQL codebase.

diff --git a/src/box/sql/build.c b/src/box/sql/build.c
index ada7b5859..a63d2d3be 100644
--- a/src/box/sql/build.c
+++ b/src/box/sql/build.c
@@ -1066,7 +1066,7 @@ emitNewSysSpaceSequenceRecord(Parse *pParse, int reg_space_id, int reg_seq_id,
        sqlVdbeAddOp2(v, OP_SCopy, reg_space_id, first_col + 1);
        
        /* 2. Sequence id  */
-       sqlVdbeAddOp2(v, OP_IntCopy, reg_seq_id, first_col + 2);
+       sqlVdbeAddOp2(v, OP_SCopy, reg_seq_id, first_col + 2);
 
        /* 3. Autogenerated. */
        sqlVdbeAddOp2(v, OP_Bool, true, first_col + 3);
diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index 12de10f37..88d1fa895 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -1361,22 +1361,6 @@ case OP_SCopy: {            /* out2 */
        break;
 }
 
-/* Opcode: IntCopy P1 P2 * * *
- * Synopsis: r[P2]=r[P1]
- *
- * Transfer the integer value held in register P1 into register P2.
- *
- * This is an optimized version of SCopy that works only for integer
- * values.
- */
-case OP_IntCopy: {            /* out2 */
-       pIn1 = &aMem[pOp->p1];
-       assert((pIn1->flags & (MEM_Int | MEM_UInt)) != 0);
-       pOut = &aMem[pOp->p2];
-       mem_set_int(pOut, pIn1->u.i, (pIn1->flags & MEM_Int) != 0);
-       break;
-}
-
 /* Opcode: ResultRow P1 P2 * * *
  * Synopsis: output=r[P1@P2]
  *

> 
>> 	break;
>> }
>> 
>> @@ -1491,6 +1499,8 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
>> 	assert(pIn1!=pOut);
>> 	if ((pIn1->flags | pIn2->flags) & MEM_Null) {
>> 		sqlVdbeMemSetNull(pOut);
>> +		/* Force NULL be of type STRING. */
>> +		pOut->field_type = FIELD_TYPE_STRING;
>> 		break;
>> 	}
>> 	/*
>> @@ -1535,6 +1545,7 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
>> 	pOut->z[nByte+1] = 0;
>> 	pOut->flags |= MEM_Term;
>> 	pOut->n = (int)nByte;
>> +	pOut->field_type = field_type_MAX;
>> 	UPDATE_MAX_BLOBSIZE(pOut);
>> 	break;
>> }
>> @@ -1644,6 +1655,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>> 		}
>> 		}
>> 		mem_set_int(pOut, iB, is_res_neg);
>> +		pOut->field_type = field_type_MAX;
>> 	} else {
>> 		bIntint = 0;
>> 		if (sqlVdbeRealValue(pIn1, &rA) != 0) {
>> @@ -1681,6 +1693,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>> 		}
>> 		pOut->u.r = rB;
>> 		MemSetTypeFlag(pOut, MEM_Real);
>> +		pOut->field_type = field_type_MAX;
>> 		if (((type1|type2)&MEM_Real)==0 && !bIntint) {
>> 			mem_apply_integer_type(pOut);
>> 		}
>> @@ -1689,6 +1702,8 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>> 
>> arithmetic_result_is_null:
>> 	sqlVdbeMemSetNull(pOut);
>> +	/* Force NULL be of type NUMBER. */
>> +	pOut->field_type = FIELD_TYPE_NUMBER;
>> 	break;
> 
> 7. All these manual assignments of field_type are broken by design.
> When you have to update field_type after each operation manually,
> you will miss some places for sure. For example:
> 
> tarantool> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
> ---
> - row_count: 1
> ...
> 
> tarantool> box.execute('INSERT INTO test VALUES (1, NULL, NULL)')
> ---
> - row_count: 1
> ...
> 
> tarantool> box.execute('SELECT typeof(a & b) FROM test')
> ---
> - metadata:
>  - name: typeof(a & b)
>    type: string
>  rows:
>  - ['null']
> ...
> 
> tarantool> box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM test')
> ---
> - metadata:
>  - name: typeof(a)
>    type: string
>  - name: typeof(b)
>    type: string
>  - name: typeof(a & b)
>    type: string
>  rows:
>  - ['integer', 'integer', 'integer']
> ...
> 
> 
> Here in the last two queries 'typeof(a & b)' result *depends* on
> other columns in result set. But it should be integer always, shouldn't
> it?

Yep, it should, I’ve fixed that:

diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index b3b230ad0..f6eab8a0e 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -1842,6 +1842,8 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
        pOut = &aMem[pOp->p3];
        if ((pIn1->flags | pIn2->flags) & MEM_Null) {
                sqlVdbeMemSetNull(pOut);
+               /* Force NULL be of type INTEGER. */
+               pOut->field_type = FIELD_TYPE_INTEGER;
                break;
        }
        bool unused;
@@ -2477,6 +2479,8 @@ case OP_BitNot: {             /* same as TK_BITNOT, in1, out2 */
        pIn1 = &aMem[pOp->p1];
        pOut = &aMem[pOp->p2];
        sqlVdbeMemSetNull(pOut);
+       /* Force NULL be of type INTEGER. */
+       pOut->field_type = FIELD_TYPE_INTEGER;
        if ((pIn1->flags & MEM_Null)==0) {
                int64_t i;
                bool is_neg;
diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua
index 489a5a91c..8a509da96 100644
--- a/test/sql/types.test.lua
+++ b/test/sql/types.test.lua
@@ -407,6 +407,12 @@ box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
 box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
 box.execute("SELECT typeof(a), typeof(s) FROM t;")
 
+box.execute('CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
+box.execute('INSERT INTO t1 VALUES (1, NULL, NULL);')
+box.execute('SELECT typeof(a & b) FROM t1;')
+box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM t1')
+
 box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
 
 box.space.T:drop()
+box.space.T1:drop()

> The same about all other places, where you call mem_set_* and don't
> update field_type.

It’s okay, since updating current value in memory cell doesn’t
imply that field type should be updated as well. In most scenarios
before setting new value to memory cell, we provide clean-up
via memAboutToChange(), which in turn invalidates field type.

> I propose you to redesign how to update field_type, so as it would
> be hard to forget to update it.
> 
> For instance, add a field_type argument to sqlVdbeMemSetNull() to never
> miss to set a real type for a NULL value. Make mem_set_int update field_type
> depending on the positivity flag. Make mem_set_u64 always set field_type
> UNSIGNED. Make sqlVdbeMemCast always set field_type to the target type.

I’ve tried to refactor patch this way, but in most cases sqlVdbeMemSetNull()
would set default (field_type_MAX). As for mem_set_* family functions -
I can move assignment of field_type_MAX to the body, but does it make
much sense?

>> @@ -1909,6 +1924,7 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>> 		}
>> 	}
>> 	mem_set_i64(pOut, iA);
>> +	pOut->field_type = field_type_MAX;
>> 	break;
>> }
>> 
>> @@ -1988,6 +2004,13 @@ case OP_Cast: {                  /* in1 */
>> 	if (ExpandBlob(pIn1) != 0)
>> 		goto abort_due_to_error;
>> 	rc = sqlVdbeMemCast(pIn1, pOp->p2);
>> +	/*
>> +	 * SCALAR is not type itself, but rather an aggregation
>> +	 * of types. Hence, cast to this type shouldn't change
>> +	 * original type of argument.
>> +	 */
>> +	if (pOp->p2 != FIELD_TYPE_SCALAR)
>> +		pIn1->field_type = pOp->p2;
> 
> 8. Why sqlVdbeMemCast does not update the type inside? Now
> this function works on half - it updates flags with a new type,
> sets a new value, but doesn't touch field_type.

Because it has several exit points and otherwise we have
to add to each branch:

diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
index e5941de8e..7892af116 100644
--- a/src/box/sql/vdbemem.c
+++ b/src/box/sql/vdbemem.c
@@ -673,10 +673,12 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type)
        case FIELD_TYPE_BOOLEAN:
                if ((pMem->flags & MEM_Int) != 0) {
                        mem_set_bool(pMem, pMem->u.i);
+                       pMem->field_type = type;
                        return 0;
                }
                if ((pMem->flags & MEM_UInt) != 0) {
                        mem_set_bool(pMem, pMem->u.u);
+                       pMem->field_type = type;
                        return 0;
                }
                if ((pMem->flags & MEM_Real) != 0) {

Etc.

What is more, this is the only path (OP_Cast) which calls
sqlVdbeMemCast() function. So IMHO it is pretty OK.

If you want, I can attempt at refactoring sqlVdbeMemCast()
so that make it feature only one exit point and move filed_type
assignment to it.

>> 	UPDATE_MAX_BLOBSIZE(pIn1);
>> 	if (rc == 0)
>> 		break;
>> @@ -2094,6 +2117,8 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
>> 	pIn3 = &aMem[pOp->p3];
>> 	flags1 = pIn1->flags;
>> 	flags3 = pIn3->flags;
>> +	enum field_type ft_p1 = pIn1->field_type;
>> +	enum field_type ft_p3 = pIn3->field_type;
>> 	if ((flags1 | flags3)&MEM_Null) {
>> 		/* One or both operands are NULL */
>> 		if (pOp->p5 & SQL_NULLEQ) {
>> @@ -2232,8 +2257,10 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
>> 	/* Undo any changes made by mem_apply_type() to the input registers. */
>> 	assert((pIn1->flags & MEM_Dyn) == (flags1 & MEM_Dyn));
>> 	pIn1->flags = flags1;
>> +	pIn1->field_type = ft_p1;
>> 	assert((pIn3->flags & MEM_Dyn) == (flags3 & MEM_Dyn));
>> 	pIn3->flags = flags3;
>> +	pIn3->field_type = ft_p3;
>> 
>> 	if (pOp->p5 & SQL_STOREP2) {
>> 		pOut = &aMem[pOp->p2];
>> @@ -2451,6 +2478,7 @@ case OP_Or: {             /* same as TK_OR, in1, in2, out3 */
>> 	} else {
>> 		mem_set_bool(pOut, v1);
>> 	}
>> +	pOut->field_type = field_type_MAX;
>> 	break;
>> }
>> 
>> @@ -2472,6 +2500,7 @@ case OP_Not: {                /* same as TK_NOT, in1, out2 */
>> 			goto abort_due_to_error;
>> 		}
>> 		mem_set_bool(pOut, ! pIn1->u.b);
>> +		pOut->field_type = field_type_MAX;
> 
> 9. So after mem_set_bool, if the cell was used for
> something else before, it is in an invalid state, right?

Why do you call it ‘invalid’? In this approach any operation
resets field type to default value to force typeof() look
at actual mp format type. Since in this case field_type simply
duplicates format type. The only exception is null values due
to storing is_null flag alongside with format type. 

> I don't think it should work so. Please, develop a
> better way how to keep field_type up to date.
> 
> Ideally, if code doesn't touch type flags, it should not touch
> field_type. Otherwise that place is broken, if you need to
> do follow-up updates after each function, doing something
> with types.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-07-31  9:14     ` n.pettik
@ 2019-07-31 16:59       ` Vladislav Shpilevoy
  2019-08-01 15:56         ` n.pettik
  0 siblings, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-07-31 16:59 UTC (permalink / raw)
  To: n.pettik, tarantool-patches

Hi! Thanks for the fixes! See 5 comments below.

On 31/07/2019 11:14, n.pettik wrote:
> 
> 
>>> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
>>> index fcf147c96..f50df105d 100644
>>> --- a/src/box/sql/func.c
>>> +++ b/src/box/sql/func.c
>>> @@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
>>> {
>>> 	const char *z = 0;
>>> 	UNUSED_PARAMETER(NotUsed);
>>> +	enum field_type f_t = argv[0]->field_type;
>>
>> 1. Perhaps, field_type is not the best name, because Mem is not always
>> is a field. It can be just a value, not from a space. It is rather
>> just 'type’.
> 
> When memory represents smth different from value fetched from
> tuple, it is set to field_type_MAX. So I’d rather not rename this member.

1. Ok, now I got it. It works only for tuple fields, and is totally disabled
for other values. Or not? See other comments.

>> 7. All these manual assignments of field_type are broken by design.
>> When you have to update field_type after each operation manually,
>> you will miss some places for sure. For example:
>>
>> tarantool> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
>> ---
>> - row_count: 1
>> ...
>>
>> tarantool> box.execute('INSERT INTO test VALUES (1, NULL, NULL)')
>> ---
>> - row_count: 1
>> ...
>>
>> tarantool> box.execute('SELECT typeof(a & b) FROM test')
>> ---
>> - metadata:
>>  - name: typeof(a & b)
>>    type: string
>>  rows:
>>  - ['null']
>> ...
>>
>> tarantool> box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM test')
>> ---
>> - metadata:
>>  - name: typeof(a)
>>    type: string
>>  - name: typeof(b)
>>    type: string
>>  - name: typeof(a & b)
>>    type: string
>>  rows:
>>  - ['integer', 'integer', 'integer']
>> ...
>>
>>
>> Here in the last two queries 'typeof(a & b)' result *depends* on
>> other columns in result set. But it should be integer always, shouldn't
>> it?
> 
> Yep, it should, I’ve fixed that:
> 
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index b3b230ad0..f6eab8a0e 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -1842,6 +1842,8 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>         pOut = &aMem[pOp->p3];
>         if ((pIn1->flags | pIn2->flags) & MEM_Null) {
>                 sqlVdbeMemSetNull(pOut);
> +               /* Force NULL be of type INTEGER. */
> +               pOut->field_type = FIELD_TYPE_INTEGER;
>                 break;
>         }
>         bool unused;
> @@ -2477,6 +2479,8 @@ case OP_BitNot: {             /* same as TK_BITNOT, in1, out2 */
>         pIn1 = &aMem[pOp->p1];
>         pOut = &aMem[pOp->p2];
>         sqlVdbeMemSetNull(pOut);
> +       /* Force NULL be of type INTEGER. */
> +       pOut->field_type = FIELD_TYPE_INTEGER;
2.

tarantool> box.execute('SELECT typeof(~NULL)')
---
- metadata:
  - name: typeof(~NULL)
    type: string
  rows:
  - ['integer']
...

But it is not integer, as I think. I didn't select NULL from
an integer column. Here NULL was of type 'boolean'. And the
result should be 'boolean' too, because it is just NULL, not
a column NULL. The same about other bit operations.

And the same about strings:

tarantool> box.execute("SELECT typeof(NULL || NULL)")
---
- metadata:
  - name: typeof(NULL || NULL)
    type: string
  rows:
  - ['string']
...

Here NULLs were of type 'boolean', but output shows 'string'.

>         if ((pIn1->flags & MEM_Null)==0) {
>                 int64_t i;
>                 bool is_neg;
> diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua
> index 489a5a91c..8a509da96 100644
> --- a/test/sql/types.test.lua
> +++ b/test/sql/types.test.lua
> @@ -407,6 +407,12 @@ box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
>  box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
>  box.execute("SELECT typeof(a), typeof(s) FROM t;")
>  
> +box.execute('CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
> +box.execute('INSERT INTO t1 VALUES (1, NULL, NULL);')
> +box.execute('SELECT typeof(a & b) FROM t1;')
> +box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM t1')
> +
>  box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
>  
>  box.space.T:drop()
> +box.space.T1:drop()
> 
>> The same about all other places, where you call mem_set_* and don't
>> update field_type.
> 
> It’s okay, since updating current value in memory cell doesn’t
> imply that field type should be updated as well. In most scenarios
> before setting new value to memory cell, we provide clean-up
> via memAboutToChange(), which in turn invalidates field type.

3. memAboutToChange is no-op in Release build, so it does not invalidate
anything in that mode. It means, that field_type now can be a garbage
from a previous value, if the memory cell was reused to assign something
new.

>> I propose you to redesign how to update field_type, so as it would
>> be hard to forget to update it.
>>
>> For instance, add a field_type argument to sqlVdbeMemSetNull() to never
>> miss to set a real type for a NULL value. Make mem_set_int update field_type
>> depending on the positivity flag. Make mem_set_u64 always set field_type
>> UNSIGNED. Make sqlVdbeMemCast always set field_type to the target type.
> 
> I’ve tried to refactor patch this way, but in most cases sqlVdbeMemSetNull()
> would set default (field_type_MAX). As for mem_set_* family functions -
> I can move assignment of field_type_MAX to the body, but does it make
> much sense?
> 

4. Currently, as I said above, I don't see where you invalidate type of a
memory cell before it takes a new value. Yes, field_type is initialized
with field_type_MAX when the register array is just created, but where
the cell type is reset, when it takes a new value?

And one another possible bug:

box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER)')
box.execute('INSERT INTO test VALUES (1, NULL)')
tarantool> box.execute('SELECT typeof(NOT a) FROM test')
---
- metadata:
  - name: typeof(NOT a)
    type: string
  rows:
  - ['boolean']
...

As you can see, 'NOT' lost the original type - it was integer NULL,
not boolean NULL.

Another place is vdbe.c:2627 - here you jump to op_column_out
bypassing field_type assignment. Why?

>>> @@ -1909,6 +1924,7 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>>> 		}
>>> 	}
>>> 	mem_set_i64(pOut, iA);
>>> +	pOut->field_type = field_type_MAX;
>>> 	break;
>>> }
>>>
>>> @@ -1988,6 +2004,13 @@ case OP_Cast: {                  /* in1 */
>>> 	if (ExpandBlob(pIn1) != 0)
>>> 		goto abort_due_to_error;
>>> 	rc = sqlVdbeMemCast(pIn1, pOp->p2);
>>> +	/*
>>> +	 * SCALAR is not type itself, but rather an aggregation
>>> +	 * of types. Hence, cast to this type shouldn't change
>>> +	 * original type of argument.
>>> +	 */
>>> +	if (pOp->p2 != FIELD_TYPE_SCALAR)
>>> +		pIn1->field_type = pOp->p2;
>>
>> 8. Why sqlVdbeMemCast does not update the type inside? Now
>> this function works on half - it updates flags with a new type,
>> sets a new value, but doesn't touch field_type.
> 
> Because it has several exit points and otherwise we have
> to add to each branch:
> 
> diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
> index e5941de8e..7892af116 100644
> --- a/src/box/sql/vdbemem.c
> +++ b/src/box/sql/vdbemem.c
> @@ -673,10 +673,12 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type)
>         case FIELD_TYPE_BOOLEAN:
>                 if ((pMem->flags & MEM_Int) != 0) {
>                         mem_set_bool(pMem, pMem->u.i);
> +                       pMem->field_type = type;
>                         return 0;
>                 }
>                 if ((pMem->flags & MEM_UInt) != 0) {
>                         mem_set_bool(pMem, pMem->u.u);
> +                       pMem->field_type = type;
>                         return 0;
>                 }
>                 if ((pMem->flags & MEM_Real) != 0) {
> 
> Etc.
> 
> What is more, this is the only path (OP_Cast) which calls
> sqlVdbeMemCast() function. So IMHO it is pretty OK.
> 
> If you want, I can attempt at refactoring sqlVdbeMemCast()
> so that make it feature only one exit point and move filed_type
> assignment to it.
> 

5. Never mind. Too many changes.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-07-31 16:59       ` Vladislav Shpilevoy
@ 2019-08-01 15:56         ` n.pettik
  2019-08-01 23:11           ` Vladislav Shpilevoy
  0 siblings, 1 reply; 11+ messages in thread
From: n.pettik @ 2019-08-01 15:56 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Vladislav Shpilevoy



> On 31 Jul 2019, at 19:59, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
> Hi! Thanks for the fixes! See 5 comments below.
> On 31/07/2019 11:14, n.pettik wrote:
>> 
>>>> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
>>>> index fcf147c96..f50df105d 100644
>>>> --- a/src/box/sql/func.c
>>>> +++ b/src/box/sql/func.c
>>>> @@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
>>>> {
>>>> 	const char *z = 0;
>>>> 	UNUSED_PARAMETER(NotUsed);
>>>> +	enum field_type f_t = argv[0]->field_type;
>>> 
>>> 1. Perhaps, field_type is not the best name, because Mem is not always
>>> is a field. It can be just a value, not from a space. It is rather
>>> just 'type’.
>> 
>> When memory represents smth different from value fetched from
>> tuple, it is set to field_type_MAX. So I’d rather not rename this member.
> 
> 1. Ok, now I got it. It works only for tuple fields, and is totally disabled
> for other values. Or not? See other comments.

Sort of. The only addition to this rule is CAST operation to
display appropriate type for NULL values.

>>> 7. All these manual assignments of field_type are broken by design.
>>> When you have to update field_type after each operation manually,
>>> you will miss some places for sure. For example:
>>> 
>>> tarantool> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
>>> ---
>>> - row_count: 1
>>> ...
>>> 
>>> tarantool> box.execute('INSERT INTO test VALUES (1, NULL, NULL)')
>>> ---
>>> - row_count: 1
>>> ...
>>> 
>>> tarantool> box.execute('SELECT typeof(a & b) FROM test')
>>> ---
>>> - metadata:
>>> - name: typeof(a & b)
>>>   type: string
>>> rows:
>>> - ['null']
>>> ...
>>> 
>>> tarantool> box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM test')
>>> ---
>>> - metadata:
>>> - name: typeof(a)
>>>   type: string
>>> - name: typeof(b)
>>>   type: string
>>> - name: typeof(a & b)
>>>   type: string
>>> rows:
>>> - ['integer', 'integer', 'integer']
>>> ...
>>> 
>>> 
>>> Here in the last two queries 'typeof(a & b)' result *depends* on
>>> other columns in result set. But it should be integer always, shouldn't
>>> it?
>> 
>> Yep, it should, I’ve fixed that:
>> 
>> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
>> index b3b230ad0..f6eab8a0e 100644
>> --- a/src/box/sql/vdbe.c
>> +++ b/src/box/sql/vdbe.c
>> @@ -1842,6 +1842,8 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>>        pOut = &aMem[pOp->p3];
>>        if ((pIn1->flags | pIn2->flags) & MEM_Null) {
>>                sqlVdbeMemSetNull(pOut);
>> +               /* Force NULL be of type INTEGER. */
>> +               pOut->field_type = FIELD_TYPE_INTEGER;
>>                break;
>>        }
>>        bool unused;
>> @@ -2477,6 +2479,8 @@ case OP_BitNot: {             /* same as TK_BITNOT, in1, out2 */
>>        pIn1 = &aMem[pOp->p1];
>>        pOut = &aMem[pOp->p2];
>>        sqlVdbeMemSetNull(pOut);
>> +       /* Force NULL be of type INTEGER. */
>> +       pOut->field_type = FIELD_TYPE_INTEGER;
> 2.
> 
> tarantool> box.execute('SELECT typeof(~NULL)')
> ---
> - metadata:
>  - name: typeof(~NULL)
>    type: string
>  rows:
>  - ['integer']
> ...
> 
> But it is not integer, as I think. I didn't select NULL from
> an integer column. Here NULL was of type 'boolean'. And the
> result should be 'boolean' too, because it is just NULL, not
> a column NULL. The same about other bit operations.

I guess it is exactly what I was asked to do. “Bitnot” operation
never returns “boolean” result - it’s argument is checked to be
of type integer. Hence, IMHO if it returned “boolean” even for
NULL values it would make users wonder why it does so. Anyway,
just in case I’m going to re-ask Konstantin and Peter is this
what we want or not.

> And the same about strings:
> 
> tarantool> box.execute("SELECT typeof(NULL || NULL)")
> ---
> - metadata:
>  - name: typeof(NULL || NULL)
>    type: string
>  rows:
>  - ['string']
> ...
> 
> Here NULLs were of type 'boolean', but output shows 'string’.

The same logic is applied for concatenation, bitwise and
arithmetic operations.

>> diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua
>> index 489a5a91c..8a509da96 100644
>> --- a/test/sql/types.test.lua
>> +++ b/test/sql/types.test.lua
>> @@ -407,6 +407,12 @@ box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
>> box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
>> box.execute("SELECT typeof(a), typeof(s) FROM t;")
>> 
>> +box.execute('CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
>> +box.execute('INSERT INTO t1 VALUES (1, NULL, NULL);')
>> +box.execute('SELECT typeof(a & b) FROM t1;')
>> +box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM t1')
>> +
>> box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
>> 
>> box.space.T:drop()
>> +box.space.T1:drop()
>> 
>>> The same about all other places, where you call mem_set_* and don't
>>> update field_type.
>> 
>> It’s okay, since updating current value in memory cell doesn’t
>> imply that field type should be updated as well. In most scenarios
>> before setting new value to memory cell, we provide clean-up
>> via memAboutToChange(), which in turn invalidates field type.
> 
> 3. memAboutToChange is no-op in Release build, so it does not invalidate
> anything in that mode. It means, that field_type now can be a garbage
> from a previous value, if the memory cell was reused to assign something
> new.

Oops, I moved field type resetting to out2prerelease() and added
call of this function to several places:

diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index f6eab8a0e..77529d346 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -558,6 +558,7 @@ out2Prerelease(Vdbe *p, VdbeOp *pOp)
                pOut->flags = MEM_Int;
                return pOut;
        }
+       pOut->field_type = field_type_MAX;
 }
 
 struct stailq *
@@ -1781,6 +1782,7 @@ case OP_Function: {
        }
 #endif
        MemSetTypeFlag(pCtx->pOut, MEM_Null);
+       pCtx->pOut->field_type = field_type_MAX;
        pCtx->is_aborted = false;
        (*pCtx->pFunc->xSFunc)(pCtx, pCtx->argc, pCtx->argv);/* IMP: R-24505-23230 */
 
@@ -2227,6 +2229,7 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
 
        if (pOp->p5 & SQL_STOREP2) {
                pOut = &aMem[pOp->p2];
+               pOut->field_type = field_type_MAX;
                iCompare = res;
                res2 = res2!=0;  /* For this path res2 must be exactly 0 or 1 */
                if ((pOp->p5 & SQL_KEEPNULL)!=0) {
@@ -2625,6 +2628,7 @@ case OP_Column: {
                                                            pReg->z, pReg->n);
                        } else {
                                sqlVdbeMemSetNull(pDest);
+                               pDest->field_type = field_type_MAX;
                                goto op_column_out;
                        }
                } else {
@@ -3684,7 +3688,7 @@ case OP_Sequence: {           /* out2 */
  * incremented by one.
  */
 case OP_NextSequenceId: {
-       pOut = &aMem[pOp->p2];
+       pOut = out2Prerelease(p, pOp);
        uint64_t id;
        tarantoolSqlNextSeqId(&id);
        id++;
@@ -3717,7 +3721,7 @@ case OP_NextIdEphemeral: {
                diag_set(ClientError, ER_ROWID_OVERFLOW);
                goto abort_due_to_error;
        }
-       pOut = &aMem[pOp->p2];
+       pOut = out2Prerelease(p, pOp);
        mem_set_u64(pOut, rowid);
        break;
 }
@@ -3925,7 +3929,7 @@ case OP_RowData: {
        }
 #endif
 
-       pOut = &aMem[pOp->p2];
+       pOut = out2Prerelease(p, pOp);
        memAboutToChange(p, pOut);
 
        assert(pOp->p1>=0 && pOp->p1<p->nCursor);
diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
index e5941de8e..bf020657f 100644
--- a/src/box/sql/vdbemem.c
+++ b/src/box/sql/vdbemem.c
@@ -894,7 +894,6 @@ sqlVdbeMemAboutToChange(Vdbe * pVdbe, Mem * pMem)
        for (i = 0, pX = pVdbe->aMem; i < pVdbe->nMem; i++, pX++) {
                if (pX->pScopyFrom == pMem) {
                        pX->flags |= MEM_Undefined;
-                       pX->field_type = field_type_MAX;
                        pX->pScopyFrom = 0;
                }
        }

>>> I propose you to redesign how to update field_type, so as it would
>>> be hard to forget to update it.
>>> 
>>> For instance, add a field_type argument to sqlVdbeMemSetNull() to never
>>> miss to set a real type for a NULL value. Make mem_set_int update field_type
>>> depending on the positivity flag. Make mem_set_u64 always set field_type
>>> UNSIGNED. Make sqlVdbeMemCast always set field_type to the target type.
>> 
>> I’ve tried to refactor patch this way, but in most cases sqlVdbeMemSetNull()
>> would set default (field_type_MAX). As for mem_set_* family functions -
>> I can move assignment of field_type_MAX to the body, but does it make
>> much sense?
>> 
> 
> 4. Currently, as I said above, I don't see where you invalidate type of a
> memory cell before it takes a new value. Yes, field_type is initialized
> with field_type_MAX when the register array is just created, but where
> the cell type is reset, when it takes a new value?
> 
> And one another possible bug:
> 
> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER)')
> box.execute('INSERT INTO test VALUES (1, NULL)')
> tarantool> box.execute('SELECT typeof(NOT a) FROM test')
> ---
> - metadata:
>  - name: typeof(NOT a)
>    type: string
>  rows:
>  - ['boolean']
> ...
> 
> As you can see, 'NOT' lost the original type - it was integer NULL,
> not boolean NULL.

Again, personally I do not consider this to be a bug
(see a bit detailed explanation above).

> Another place is vdbe.c:2627 - here you jump to op_column_out
> bypassing field_type assignment. Why?

Simply missed this place, fixed:

@@ -2625,6 +2628,7 @@ case OP_Column: {
                                                            pReg->z, pReg->n);
                        } else {
                                sqlVdbeMemSetNull(pDest);
+                               pDest->field_type = field_type_MAX;
                                goto op_column_out;
                        }
                } else {


>>>> @@ -1909,6 +1924,7 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>>>> 		}
>>>> 	}
>>>> 	mem_set_i64(pOut, iA);
>>>> +	pOut->field_type = field_type_MAX;
>>>> 	break;
>>>> }
>>>> 
>>>> @@ -1988,6 +2004,13 @@ case OP_Cast: {                  /* in1 */
>>>> 	if (ExpandBlob(pIn1) != 0)
>>>> 		goto abort_due_to_error;
>>>> 	rc = sqlVdbeMemCast(pIn1, pOp->p2);
>>>> +	/*
>>>> +	 * SCALAR is not type itself, but rather an aggregation
>>>> +	 * of types. Hence, cast to this type shouldn't change
>>>> +	 * original type of argument.
>>>> +	 */
>>>> +	if (pOp->p2 != FIELD_TYPE_SCALAR)
>>>> +		pIn1->field_type = pOp->p2;
>>> 
>>> 8. Why sqlVdbeMemCast does not update the type inside? Now
>>> this function works on half - it updates flags with a new type,
>>> sets a new value, but doesn't touch field_type.
>> 
>> Because it has several exit points and otherwise we have
>> to add to each branch:
>> 
>> diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
>> index e5941de8e..7892af116 100644
>> --- a/src/box/sql/vdbemem.c
>> +++ b/src/box/sql/vdbemem.c
>> @@ -673,10 +673,12 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type)
>>        case FIELD_TYPE_BOOLEAN:
>>                if ((pMem->flags & MEM_Int) != 0) {
>>                        mem_set_bool(pMem, pMem->u.i);
>> +                       pMem->field_type = type;
>>                        return 0;
>>                }
>>                if ((pMem->flags & MEM_UInt) != 0) {
>>                        mem_set_bool(pMem, pMem->u.u);
>> +                       pMem->field_type = type;
>>                        return 0;
>>                }
>>                if ((pMem->flags & MEM_Real) != 0) {
>> 
>> Etc.
>> 
>> What is more, this is the only path (OP_Cast) which calls
>> sqlVdbeMemCast() function. So IMHO it is pretty OK.
>> 
>> If you want, I can attempt at refactoring sqlVdbeMemCast()
>> so that make it feature only one exit point and move filed_type
>> assignment to it.
>> 
> 
> 5. Never mind. Too many changes.

I’m not sure if it is a sarcasm or not...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-08-01 15:56         ` n.pettik
@ 2019-08-01 23:11           ` Vladislav Shpilevoy
  2019-08-02 10:02             ` n.pettik
  0 siblings, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2019-08-01 23:11 UTC (permalink / raw)
  To: n.pettik, tarantool-patches


>>> diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua
>>> index 489a5a91c..8a509da96 100644
>>> --- a/test/sql/types.test.lua
>>> +++ b/test/sql/types.test.lua
>>> @@ -407,6 +407,12 @@ box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
>>> box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
>>> box.execute("SELECT typeof(a), typeof(s) FROM t;")
>>>
>>> +box.execute('CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
>>> +box.execute('INSERT INTO t1 VALUES (1, NULL, NULL);')
>>> +box.execute('SELECT typeof(a & b) FROM t1;')
>>> +box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM t1')
>>> +
>>> box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
>>>
>>> box.space.T:drop()
>>> +box.space.T1:drop()
>>>
>>>> The same about all other places, where you call mem_set_* and don't
>>>> update field_type.
>>>
>>> It’s okay, since updating current value in memory cell doesn’t
>>> imply that field type should be updated as well. In most scenarios
>>> before setting new value to memory cell, we provide clean-up
>>> via memAboutToChange(), which in turn invalidates field type.
>>
>> 3. memAboutToChange is no-op in Release build, so it does not invalidate
>> anything in that mode. It means, that field_type now can be a garbage
>> from a previous value, if the memory cell was reused to assign something
>> new.
> 
> Oops, I moved field type resetting to out2prerelease() and added
> call of this function to several places.

Another missed place: vdbe.c:2110.
Also if on line vdbe.c:2462 the condition is false,
then pOut field_type can be a garbage from a previous
value. What is strange, for OP_BitNot you set field_type INTEGER
unconditionally, but for OP_Not you set it to MAX instead of BOOLEAN,
and conditionally. One of these places is definitely a bug.

In OP_MakeRecord pOut also does not set field_type, even to MAX.
There is sqlVdbeMemRelease, but it is noop if the mem didn't store
a dynamically allocated value.

Additionally, you put field_type invalidation in out2prerelease to
an unreachable path, after all returns.

Finally, OP_Fetch does not set field type even when a field is got
from a space tuple - it is used in checks:

box.cfg{}
box.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, CHECK(typeof(a) == 'integer'))")
box.execute("INSERT INTO test VALUES (1, 1)")
box.execute("INSERT INTO test VALUES (2, NULL)")
---
- error: 'Check constraint failed ''CK_CONSTRAINT_1_TEST'': typeof(a) == ''integer'''
...


There are still more bugs, but since 2.2 is coming, I've finished the
patch.

>>>
>>> What is more, this is the only path (OP_Cast) which calls
>>> sqlVdbeMemCast() function. So IMHO it is pretty OK.
>>>
>>> If you want, I can attempt at refactoring sqlVdbeMemCast()
>>> so that make it feature only one exit point and move filed_type
>>> assignment to it.
>>>
>>
>> 5. Never mind. Too many changes.
> 
> I’m not sure if it is a sarcasm or not...
> 

It is not. There are really too many changes, and in addition now I
think, that it is a good idea to keep all field_type updates in vdbe.c
since it can be set to a column only. And only vdbe.c fetches columns.
I don't remember using sarcasm in my reviews btw (except sometimes in
titles).

My review fixes, on top of the branch. If you apply them, then LGTM.
Otherwise give me your comments, and we will continue the work (it
won't be LGTM then, *Kirill*, don't push!).

========================================================================

commit 6a01ec5219cdc8465e37713c6092b56440421258
Author: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Date:   Thu Aug 1 23:52:07 2019 +0200

    Review fixes

diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index 77529d346..143278a67 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -532,33 +532,16 @@ registerTrace(int iReg, Mem *p) {
 
 #endif
 
-/*
- * Return the register of pOp->p2 after first preparing it to be
- * overwritten with an integer value.
- */
-static SQL_NOINLINE Mem *
-out2PrereleaseWithClear(Mem *pOut)
-{
-	sqlVdbeMemSetNull(pOut);
-	pOut->flags = MEM_Int;
-	return pOut;
-}
-
-static Mem *
-out2Prerelease(Vdbe *p, VdbeOp *pOp)
+static struct Mem *
+vdbe_prepare_null_out(struct Vdbe *v, int n)
 {
-	Mem *pOut;
-	assert(pOp->p2>0);
-	assert(pOp->p2<=(p->nMem+1 - p->nCursor));
-	pOut = &p->aMem[pOp->p2];
-	memAboutToChange(p, pOut);
-	if (VdbeMemDynamic(pOut)) { /*OPTIMIZATION-IF-FALSE*/
-		return out2PrereleaseWithClear(pOut);
-	} else {
-		pOut->flags = MEM_Int;
-		return pOut;
-	}
-	pOut->field_type = field_type_MAX;
+	assert(n > 0);
+	assert(n <= (v->nMem + 1 - v->nCursor));
+	struct Mem *out = &v->aMem[n];
+	memAboutToChange(v, out);
+	sqlVdbeMemSetNull(out);
+	out->field_type = field_type_MAX;
+	return out;
 }
 
 struct stailq *
@@ -607,6 +590,17 @@ mem_type_to_str(const struct Mem *p)
 	}
 }
 
+static inline const struct tuple_field *
+vdbe_field_ref_fetch_field(struct vdbe_field_ref *field_ref, uint32_t fieldno)
+{
+	if (field_ref->tuple == NULL)
+		return NULL;
+	struct tuple_format *format = tuple_format(field_ref->tuple);
+	if (fieldno >= tuple_format_field_count(format))
+		return NULL;
+	return tuple_format_field(format, fieldno);
+}
+
 /**
  * Try to get a current tuple field using its field map.
  * @param field_ref The vdbe_field_ref instance to use.
@@ -620,12 +614,9 @@ static const void *
 vdbe_field_ref_fast_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 			  uint32_t *field_size)
 {
-	if (field_ref->tuple == NULL)
-		return NULL;
-	struct tuple_format *format = tuple_format(field_ref->tuple);
-	if (fieldno >= tuple_format_field_count(format) ||
-	    tuple_format_field(format, fieldno)->offset_slot ==
-	    TUPLE_OFFSET_SLOT_NIL)
+	const struct tuple_field *tf =
+		vdbe_field_ref_fetch_field(field_ref, fieldno);
+	if (tf == NULL || tf->offset_slot == TUPLE_OFFSET_SLOT_NIL)
 		return NULL;
 	const char *field = tuple_field(field_ref->tuple, fieldno);
 	const char *end = field;
@@ -634,6 +625,16 @@ vdbe_field_ref_fast_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 	return field;
 }
 
+static inline enum field_type
+vdbe_field_ref_fetch_type(struct vdbe_field_ref *field_ref, uint32_t fieldno)
+{
+	const struct tuple_field *tf =
+		vdbe_field_ref_fetch_field(field_ref, fieldno);
+	if (tf == NULL || tf->type == FIELD_TYPE_ANY)
+		return field_type_MAX;
+	return tf->type;
+}
+
 /**
  * Fetch field by fieldno using vdbe_field_ref and store result
  * in dest_mem.
@@ -647,7 +648,6 @@ static int
 vdbe_field_ref_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 		     struct Mem *dest_mem)
 {
-	sqlVdbeMemSetNull(dest_mem);
 	uint32_t *slots = field_ref->slots;
 	if (fieldno >= field_ref->field_count) {
 		UPDATE_MAX_BLOBSIZE(dest_mem);
@@ -719,6 +719,7 @@ vdbe_field_ref_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 		dest_mem->flags |= MEM_Term;
 	}
 	UPDATE_MAX_BLOBSIZE(dest_mem);
+	dest_mem->field_type = vdbe_field_ref_fetch_type(field_ref, fieldno);
 	return 0;
 }
 
@@ -1076,13 +1077,8 @@ case OP_Halt: {
  * The 32-bit integer value P1 is written into register P2.
  */
 case OP_Integer: {         /* out2 */
-	pOut = out2Prerelease(p, pOp);
-	if (pOp->p1 < 0) {
-		pOut->u.i = pOp->p1;
-		assert((pOut->flags & MEM_Int) != 0);
-	} else {
-		mem_set_u64(pOut, pOp->p1);
-	}
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
+	mem_set_i64(pOut, pOp->p1);
 	break;
 }
 
@@ -1092,7 +1088,7 @@ case OP_Integer: {         /* out2 */
  * The boolean value P1 is written into register P2.
  */
 case OP_Bool: {         /* out2 */
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	assert(pOp->p1 == 1 || pOp->p1 == 0);
 	mem_set_bool(pOut, pOp->p1);
 	break;
@@ -1105,7 +1101,7 @@ case OP_Bool: {         /* out2 */
  * Write that value into register P2.
  */
 case OP_Int64: {           /* out2 */
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	assert(pOp->p4.pI64!=0);
 	mem_set_int(pOut, *pOp->p4.pI64, pOp->p4type == P4_INT64);
 	break;
@@ -1118,7 +1114,7 @@ case OP_Int64: {           /* out2 */
  * Write that value into register P2.
  */
 case OP_Real: {            /* same as TK_FLOAT, out2 */
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	pOut->flags = MEM_Real;
 	assert(!sqlIsNaN(*pOp->p4.pReal));
 	pOut->u.r = *pOp->p4.pReal;
@@ -1135,7 +1131,6 @@ case OP_Real: {            /* same as TK_FLOAT, out2 */
  */
 case OP_String8: {         /* same as TK_STRING, out2 */
 	assert(pOp->p4.z!=0);
-	pOut = out2Prerelease(p, pOp);
 	pOp->opcode = OP_String;
 	pOp->p1 = sqlStrlen30(pOp->p4.z);
 
@@ -1161,7 +1156,7 @@ case OP_String8: {         /* same as TK_STRING, out2 */
  */
 case OP_String: {          /* out2 */
 	assert(pOp->p4.z!=0);
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	pOut->flags = MEM_Str|MEM_Static|MEM_Term;
 	pOut->z = pOp->p4.z;
 	pOut->n = pOp->p1;
@@ -1190,7 +1185,7 @@ case OP_NextAutoincValue: {
 	if (sequence == NULL || sequence_next(sequence, &value) != 0)
 		goto abort_due_to_error;
 
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	mem_set_i64(pOut, value);
 
 	break;
@@ -1211,7 +1206,7 @@ case OP_NextAutoincValue: {
 case OP_Null: {           /* out2 */
 	int cnt;
 	u16 nullFlag;
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	cnt = pOp->p3-pOp->p2;
 	assert(pOp->p3<=(p->nMem+1 - p->nCursor));
 	pOut->flags = nullFlag = pOp->p1 ? (MEM_Null|MEM_Cleared) : MEM_Null;
@@ -1221,6 +1216,7 @@ case OP_Null: {           /* out2 */
 		memAboutToChange(p, pOut);
 		sqlVdbeMemSetNull(pOut);
 		pOut->flags = nullFlag;
+		pOut->field_type = field_type_MAX;
 		pOut->n = 0;
 		cnt--;
 	}
@@ -1235,7 +1231,7 @@ case OP_Null: {           /* out2 */
  */
 case OP_Blob: {                /* out2 */
 	assert(pOp->p1 <= SQL_MAX_LENGTH);
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	sqlVdbeMemSetStr(pOut, pOp->p4.z, pOp->p1, 0, 0);
 	if (pOp->p3!=0) {
 		pOut->flags |= MEM_Subtype;
@@ -1262,7 +1258,7 @@ case OP_Variable: {            /* out2 */
 	if (sqlVdbeMemTooBig(pVar)) {
 		goto too_big;
 	}
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	sqlVdbeMemShallowCopy(pOut, pVar, MEM_Static);
 	UPDATE_MAX_BLOBSIZE(pOut);
 	break;
@@ -1457,10 +1453,9 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
 
 	pIn1 = &aMem[pOp->p1];
 	pIn2 = &aMem[pOp->p2];
-	pOut = &aMem[pOp->p3];
+	pOut = vdbe_prepare_null_out(p, pOp->p3);
 	assert(pIn1!=pOut);
 	if ((pIn1->flags | pIn2->flags) & MEM_Null) {
-		sqlVdbeMemSetNull(pOut);
 		/* Force NULL be of type STRING. */
 		pOut->field_type = FIELD_TYPE_STRING;
 		break;
@@ -1507,7 +1502,6 @@ case OP_Concat: {           /* same as TK_CONCAT, in1, in2, out3 */
 	pOut->z[nByte+1] = 0;
 	pOut->flags |= MEM_Term;
 	pOut->n = (int)nByte;
-	pOut->field_type = field_type_MAX;
 	UPDATE_MAX_BLOBSIZE(pOut);
 	break;
 }
@@ -1568,7 +1562,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 	type1 = numericType(pIn1);
 	pIn2 = &aMem[pOp->p2];
 	type2 = numericType(pIn2);
-	pOut = &aMem[pOp->p3];
+	pOut = vdbe_prepare_null_out(p, pOp->p3);
 	flags = pIn1->flags | pIn2->flags;
 	if ((flags & MEM_Null)!=0) goto arithmetic_result_is_null;
 	if ((type1 & (MEM_Int | MEM_UInt)) != 0 &&
@@ -1617,7 +1611,6 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 		}
 		}
 		mem_set_int(pOut, iB, is_res_neg);
-		pOut->field_type = field_type_MAX;
 	} else {
 		bIntint = 0;
 		if (sqlVdbeRealValue(pIn1, &rA) != 0) {
@@ -1655,7 +1648,6 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 		}
 		pOut->u.r = rB;
 		MemSetTypeFlag(pOut, MEM_Real);
-		pOut->field_type = field_type_MAX;
 		if (((type1|type2)&MEM_Real)==0 && !bIntint) {
 			mem_apply_integer_type(pOut);
 		}
@@ -1663,7 +1655,6 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
 	break;
 
 arithmetic_result_is_null:
-	sqlVdbeMemSetNull(pOut);
 	/* Force NULL be of type NUMBER. */
 	pOut->field_type = FIELD_TYPE_NUMBER;
 	break;
@@ -1768,21 +1759,18 @@ case OP_Function: {
 	 * checks to see if the register array has changed, and if so it
 	 * reinitializes the relavant parts of the sql_context object
 	 */
-	pOut = &aMem[pOp->p3];
+	pOut = vdbe_prepare_null_out(p, pOp->p3);
 	if (pCtx->pOut != pOut) {
 		pCtx->pOut = pOut;
 		for(i=pCtx->argc-1; i>=0; i--) pCtx->argv[i] = &aMem[pOp->p2+i];
 	}
 
-	memAboutToChange(p, pCtx->pOut);
 #ifdef SQL_DEBUG
 	for(i=0; i<pCtx->argc; i++) {
 		assert(memIsValid(pCtx->argv[i]));
 		REGISTER_TRACE(p, pOp->p2+i, pCtx->argv[i]);
 	}
 #endif
-	MemSetTypeFlag(pCtx->pOut, MEM_Null);
-	pCtx->pOut->field_type = field_type_MAX;
 	pCtx->is_aborted = false;
 	(*pCtx->pFunc->xSFunc)(pCtx, pCtx->argc, pCtx->argv);/* IMP: R-24505-23230 */
 
@@ -1841,9 +1829,8 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
 
 	pIn1 = &aMem[pOp->p1];
 	pIn2 = &aMem[pOp->p2];
-	pOut = &aMem[pOp->p3];
+	pOut = vdbe_prepare_null_out(p, pOp->p3);
 	if ((pIn1->flags | pIn2->flags) & MEM_Null) {
-		sqlVdbeMemSetNull(pOut);
 		/* Force NULL be of type INTEGER. */
 		pOut->field_type = FIELD_TYPE_INTEGER;
 		break;
@@ -1889,7 +1876,6 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
 		}
 	}
 	mem_set_i64(pOut, iA);
-	pOut->field_type = field_type_MAX;
 	break;
 }
 
@@ -2107,10 +2093,8 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
 			 * The jump is taken if the SQL_JUMPIFNULL bit is set.
 			 */
 			if (pOp->p5 & SQL_STOREP2) {
-				pOut = &aMem[pOp->p2];
+				pOut = vdbe_prepare_null_out(p, pOp->p2);
 				iCompare = 1;    /* Operands are not equal */
-				memAboutToChange(p, pOut);
-				MemSetTypeFlag(pOut, MEM_Null);
 				REGISTER_TRACE(p, pOp->p2, pOut);
 			} else {
 				VdbeBranchTaken(2,3);
@@ -2228,8 +2212,6 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
 	pIn3->field_type = ft_p3;
 
 	if (pOp->p5 & SQL_STOREP2) {
-		pOut = &aMem[pOp->p2];
-		pOut->field_type = field_type_MAX;
 		iCompare = res;
 		res2 = res2!=0;  /* For this path res2 must be exactly 0 or 1 */
 		if ((pOp->p5 & SQL_KEEPNULL)!=0) {
@@ -2249,7 +2231,7 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
 			testcase( res2==1 && pOp->opcode==OP_Ne);
 			if ((pOp->opcode==OP_Eq)==res2) break;
 		}
-		memAboutToChange(p, pOut);
+		pOut = vdbe_prepare_null_out(p, pOp->p2);
 		mem_set_bool(pOut, res2);
 		REGISTER_TRACE(p, pOp->p2, pOut);
 	} else {
@@ -2438,13 +2420,9 @@ case OP_Or: {             /* same as TK_OR, in1, in2, out3 */
 		static const unsigned char or_logic[] = { 0, 1, 2, 1, 1, 1, 2, 1, 2 };
 		v1 = or_logic[v1*3+v2];
 	}
-	pOut = &aMem[pOp->p3];
-	if (v1==2) {
-		MemSetTypeFlag(pOut, MEM_Null);
-	} else {
+	pOut = vdbe_prepare_null_out(p, pOp->p3);
+	if (v1 != 2)
 		mem_set_bool(pOut, v1);
-	}
-	pOut->field_type = field_type_MAX;
 	break;
 }
 
@@ -2457,8 +2435,8 @@ case OP_Or: {             /* same as TK_OR, in1, in2, out3 */
  */
 case OP_Not: {                /* same as TK_NOT, in1, out2 */
 	pIn1 = &aMem[pOp->p1];
-	pOut = &aMem[pOp->p2];
-	sqlVdbeMemSetNull(pOut);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
+	pOut->field_type = FIELD_TYPE_BOOLEAN;
 	if ((pIn1->flags & MEM_Null)==0) {
 		if ((pIn1->flags & MEM_Bool) == 0) {
 			diag_set(ClientError, ER_SQL_TYPE_MISMATCH,
@@ -2466,7 +2444,6 @@ case OP_Not: {                /* same as TK_NOT, in1, out2 */
 			goto abort_due_to_error;
 		}
 		mem_set_bool(pOut, ! pIn1->u.b);
-		pOut->field_type = field_type_MAX;
 	}
 	break;
 }
@@ -2480,8 +2457,7 @@ case OP_Not: {                /* same as TK_NOT, in1, out2 */
  */
 case OP_BitNot: {             /* same as TK_BITNOT, in1, out2 */
 	pIn1 = &aMem[pOp->p1];
-	pOut = &aMem[pOp->p2];
-	sqlVdbeMemSetNull(pOut);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	/* Force NULL be of type INTEGER. */
 	pOut->field_type = FIELD_TYPE_INTEGER;
 	if ((pIn1->flags & MEM_Null)==0) {
@@ -2609,8 +2585,7 @@ case OP_Column: {
 	p2 = pOp->p2;
 
 	assert(pOp->p3>0 && pOp->p3<=(p->nMem+1 - p->nCursor));
-	pDest = &aMem[pOp->p3];
-	memAboutToChange(p, pDest);
+	pDest = vdbe_prepare_null_out(p, pOp->p3);
 	assert(pOp->p1>=0 && pOp->p1<p->nCursor);
 	assert(pC!=0);
 	assert(p2<pC->nField);
@@ -2627,8 +2602,6 @@ case OP_Column: {
 				vdbe_field_ref_prepare_data(&pC->field_ref,
 							    pReg->z, pReg->n);
 			} else {
-				sqlVdbeMemSetNull(pDest);
-				pDest->field_type = field_type_MAX;
 				goto op_column_out;
 			}
 		} else {
@@ -2699,8 +2672,7 @@ case OP_Fetch: {
 	struct vdbe_field_ref *field_ref =
 		(struct vdbe_field_ref *) p->aMem[pOp->p1].u.p;
 	uint32_t field_idx = pOp->p2;
-	struct Mem *dest_mem = &aMem[pOp->p3];
-	memAboutToChange(p, dest_mem);
+	struct Mem *dest_mem = vdbe_prepare_null_out(p, pOp->p3);
 	if (vdbe_field_ref_fetch(field_ref, field_idx, dest_mem) != 0)
 		goto abort_due_to_error;
 	REGISTER_TRACE(p, pOp->p3, dest_mem);
@@ -2784,8 +2756,7 @@ case OP_MakeRecord: {
 
 	/* Identify the output register */
 	assert(pOp->p3<pOp->p1 || pOp->p3>=pOp->p1+pOp->p2);
-	pOut = &aMem[pOp->p3];
-	memAboutToChange(p, pOut);
+	pOut = vdbe_prepare_null_out(p, pOp->p3);
 
 	/* Apply the requested types to all inputs */
 	assert(pData0<=pLast);
@@ -2860,7 +2831,7 @@ case OP_Count: {         /* out2 */
 		assert((pCrsr->curFlags & BTCF_TEphemCursor) != 0);
 		nEntry = tarantoolsqlEphemeralCount(pCrsr);
 	}
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	mem_set_u64(pOut, nEntry);
 	break;
 }
@@ -3674,7 +3645,7 @@ case OP_Found: {        /* jump, in3 */
 case OP_Sequence: {           /* out2 */
 	assert(pOp->p1>=0 && pOp->p1<p->nCursor);
 	assert(p->apCsr[pOp->p1]!=0);
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	int64_t seq_val = p->apCsr[pOp->p1]->seqCount++;
 	mem_set_u64(pOut, seq_val);
 	break;
@@ -3688,7 +3659,7 @@ case OP_Sequence: {           /* out2 */
  * incremented by one.
  */
 case OP_NextSequenceId: {
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	uint64_t id;
 	tarantoolSqlNextSeqId(&id);
 	id++;
@@ -3721,7 +3692,7 @@ case OP_NextIdEphemeral: {
 		diag_set(ClientError, ER_ROWID_OVERFLOW);
 		goto abort_due_to_error;
 	}
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	mem_set_u64(pOut, rowid);
 	break;
 }
@@ -3749,15 +3720,12 @@ case OP_FCopy: {     /* out2 */
 	}
 
 	if ((pOp->p3 & OPFLAG_NOOP_IF_NULL) && (pIn1->flags & MEM_Null)) {
-		pOut = &aMem[pOp->p2];
-		if (pOut->flags & MEM_Undefined) pOut->flags = MEM_Null;
-		pOut->field_type = field_type_MAX;
-		/* Flag is set and register is NULL -> do nothing  */
+		pOut = vdbe_prepare_null_out(p, pOp->p2);
 	} else {
 		assert(memIsValid(pIn1));
 		assert((pIn1->flags & (MEM_Int | MEM_UInt)) != 0);
 
-		pOut = &aMem[pOp->p2];
+		pOut = vdbe_prepare_null_out(p, pOp->p2);
 		mem_set_int(pOut, pIn1->u.i, pIn1->flags == MEM_Int);
 		pOut->field_type = pIn1->field_type;
 	}
@@ -3885,7 +3853,7 @@ case OP_SorterCompare: {
 case OP_SorterData: {
 	VdbeCursor *pC;
 
-	pOut = &aMem[pOp->p2];
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	pC = p->apCsr[pOp->p1];
 	assert(isSorter(pC));
 	if (sqlVdbeSorterRowkey(pC, pOut) != 0)
@@ -3929,8 +3897,7 @@ case OP_RowData: {
 	}
 #endif
 
-	pOut = out2Prerelease(p, pOp);
-	memAboutToChange(p, pOut);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 
 	assert(pOp->p1>=0 && pOp->p1<p->nCursor);
 	pC = p->apCsr[pOp->p1];
@@ -3957,7 +3924,6 @@ case OP_RowData: {
 	}
 	testcase( n==0);
 
-	sqlVdbeMemRelease(pOut);
 	if (sql_vdbe_mem_alloc_region(pOut, n) != 0)
 		goto abort_due_to_error;
 	sqlCursorPayload(pCrsr, 0, n, pOut->z);
@@ -4862,7 +4828,7 @@ case OP_Program: {        /* jump */
 case OP_Param: {           /* out2 */
 	VdbeFrame *pFrame;
 	Mem *pIn;
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 	pFrame = p->pFrame;
 	pIn = &pFrame->aMem[pOp->p1 + pFrame->aOp[pFrame->pc].p1];
 	sqlVdbeMemShallowCopy(pOut, pIn, MEM_Ephem);
@@ -4960,7 +4926,7 @@ case OP_IfPos: {        /* jump, in1 */
 case OP_OffsetLimit: {    /* in1, out2, in3 */
 	pIn1 = &aMem[pOp->p1];
 	pIn3 = &aMem[pOp->p3];
-	pOut = out2Prerelease(p, pOp);
+	pOut = vdbe_prepare_null_out(p, pOp->p2);
 
 	assert((pIn1->flags & MEM_UInt) != 0);
 	assert((pIn3->flags & MEM_UInt) != 0);
@@ -5225,7 +5191,7 @@ case OP_Init: {          /* jump */
  */
 case OP_IncMaxid: {
 	assert(pOp->p1 > 0);
-	pOut = &aMem[pOp->p1];
+	pOut = vdbe_prepare_null_out(p, pOp->p1);
 
 	if (tarantoolsqlIncrementMaxid(&pOut->u.u) != 0)
 		goto abort_due_to_error;
diff --git a/test/sql-tap/check.test.lua b/test/sql-tap/check.test.lua
index 98f906cf4..19b56d13f 100755
--- a/test/sql-tap/check.test.lua
+++ b/test/sql-tap/check.test.lua
@@ -250,13 +250,14 @@ test:do_catchsql_test(
         -- </check-2.4>
     })
 
-test:do_catchsql_test(
+test:do_execsql_test(
     "check-2.5",
     [[
         INSERT INTO t2 VALUES(4, NULL, 5, NULL);
+        DELETE FROM t2 WHERE id = 4;
     ]], {
         -- <check-2.5>
-        1, "Check constraint failed 'TWO': typeof(coalesce(y,0.1))=='number'"
+
         -- </check-2.5>
     })
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
  2019-08-01 23:11           ` Vladislav Shpilevoy
@ 2019-08-02 10:02             ` n.pettik
  0 siblings, 0 replies; 11+ messages in thread
From: n.pettik @ 2019-08-02 10:02 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Vladislav Shpilevoy


> My review fixes, on top of the branch. If you apply them, then LGTM.
> Otherwise give me your comments, and we will continue the work (it
> won't be LGTM then, *Kirill*, don't push!).

Ok, thanks, they look OK to me so I I’ve squashed them.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] [PATCH 2/2] sql: make default type of NULL be boolean
  2019-07-27 18:45 [tarantool-patches] [PATCH 0/2] Improve operability of typeof() function Nikita Pettik
  2019-07-27 18:45 ` [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member Nikita Pettik
@ 2019-07-27 18:45 ` Nikita Pettik
  2019-08-02 10:52 ` [tarantool-patches] Re: [PATCH 0/2] Improve operability of typeof() function Kirill Yukhin
  2 siblings, 0 replies; 11+ messages in thread
From: Nikita Pettik @ 2019-07-27 18:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy, Nikita Pettik

It was decided that null value in SQL by default should be of type
boolean. Justification of such change is that according to ANSI boolean
type in fact has three different values: true, false and unknown. The
latter is basically an alias to null value.
---
 src/box/sql/func.c          |  3 ++-
 test/sql-tap/cast.test.lua  |  4 ++--
 test/sql-tap/func.test.lua  | 12 ++++++------
 test/sql-tap/table.test.lua |  2 +-
 test/sql/types.result       |  2 +-
 5 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index f50df105d..06904e082 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -133,10 +133,11 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
 		z = "scalar";
 		break;
 	case MP_BOOL:
+	case MP_NIL:
 		z = "boolean";
 		break;
 	default:
-		z = "null";
+		unreachable();
 		break;
 	}
 	sql_result_text(context, z, -1, SQL_STATIC);
diff --git a/test/sql-tap/cast.test.lua b/test/sql-tap/cast.test.lua
index f06cc81d6..96162639c 100755
--- a/test/sql-tap/cast.test.lua
+++ b/test/sql-tap/cast.test.lua
@@ -120,7 +120,7 @@ test:do_execsql_test(
         SELECT typeof(NULL)
     ]], {
         -- <cast-1.12>
-        "null"
+        "boolean"
         -- </cast-1.12>
     })
 
@@ -180,7 +180,7 @@ test:do_execsql_test(
         SELECT typeof(CAST(NULL AS SCALAR))
     ]], {
         -- <cast-1.18>
-        "null"
+        "boolean"
         -- </cast-1.18>
     })
 
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 961fd350a..09407d3fa 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1802,7 +1802,7 @@ test:do_execsql_test(
         SELECT typeof(replace('This is the main test string', NULL, 'ALT'));
     ]], {
         -- <func-21.3>
-        "null"
+        "boolean"
         -- </func-21.3>
     })
 
@@ -1812,7 +1812,7 @@ test:do_execsql_test(
         SELECT typeof(replace(NULL, 'main', 'ALT'));
     ]], {
         -- <func-21.4>
-        "null"
+        "boolean"
         -- </func-21.4>
     })
 
@@ -1822,7 +1822,7 @@ test:do_execsql_test(
         SELECT typeof(replace('This is the main test string', 'main', NULL));
     ]], {
         -- <func-21.5>
-        "null"
+        "boolean"
         -- </func-21.5>
     })
 
@@ -2043,7 +2043,7 @@ test:do_execsql_test(
         SELECT typeof(trim(NULL));
     ]], {
         -- <func-22.20>
-        "null"
+        "boolean"
         -- </func-22.20>
     })
 
@@ -2053,7 +2053,7 @@ test:do_execsql_test(
         SELECT typeof(TRIM('xyz' FROM NULL));
     ]], {
         -- <func-22.21>
-        "null"
+        "boolean"
         -- </func-22.21>
     })
 
@@ -2063,7 +2063,7 @@ test:do_execsql_test(
         SELECT typeof(TRIM(NULL FROM 'hello'));
     ]], {
         -- <func-22.22>
-        "null"
+        "boolean"
         -- </func-22.22>
     })
 
diff --git a/test/sql-tap/table.test.lua b/test/sql-tap/table.test.lua
index 7f4b15e72..18adf70a9 100755
--- a/test/sql-tap/table.test.lua
+++ b/test/sql-tap/table.test.lua
@@ -906,7 +906,7 @@ test:do_execsql_test(
         FROM t7 LIMIT 1;
     ]], {
         -- <table-11.1>
-        "integer", "number", "string", "string", "null", "null", "string", "string"
+        "integer", "number", "string", "string", "boolean", "boolean", "string", "string"
         -- </table-11.1>
     })
 
diff --git a/test/sql/types.result b/test/sql/types.result
index 6e7fffa42..3750ed2a3 100644
--- a/test/sql/types.result
+++ b/test/sql/types.result
@@ -1771,7 +1771,7 @@ box.execute("SELECT typeof(a), typeof(s) FROM t;")
     type: string
   rows:
   - ['integer', 'integer']
-  - ['integer', 'null']
+  - ['integer', 'boolean']
 ...
 box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
 ---
-- 
2.15.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tarantool-patches] Re: [PATCH 0/2] Improve operability of typeof() function
  2019-07-27 18:45 [tarantool-patches] [PATCH 0/2] Improve operability of typeof() function Nikita Pettik
  2019-07-27 18:45 ` [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member Nikita Pettik
  2019-07-27 18:45 ` [tarantool-patches] [PATCH 2/2] sql: make default type of NULL be boolean Nikita Pettik
@ 2019-08-02 10:52 ` Kirill Yukhin
  2 siblings, 0 replies; 11+ messages in thread
From: Kirill Yukhin @ 2019-08-02 10:52 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy, Nikita Pettik

Hello,

On 27 Jul 21:45, Nikita Pettik wrote:
> Branch: https://github.com/tarantool/tarantool/compare/np/gh-4148-add-field-type-to-mem
> Issue: https://github.com/tarantool/tarantool/issues/4148
> 
> First patch adds field_type member to struct Mem. It allows to
> improve type calculation for NULL values: when value is fetched
> from tuple, field_type is assigned to the type of corresponding
> field in space format. So that NULL from field with INTEGER type
> now has INTEGER type, from REAL - REAL etc. The only exception is
> SCALAR type since it's not basic type but rather aggregation of
> types. Thus, NULL values from SCALAR field features default type
> (in the next patch it is changed to BOOLEAN).
> 
> Second patch switches default type of NULL literal to boolean as it
> was decided to do during numerous discussions in mailing list. See
> commit message for details.
> 
> Nikita Pettik (2):
>   sql: extend struct Mem with field_type member
>   sql: make default type of NULL be boolean

I've checked the patchset into master.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-08-02 10:52 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-27 18:45 [tarantool-patches] [PATCH 0/2] Improve operability of typeof() function Nikita Pettik
2019-07-27 18:45 ` [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member Nikita Pettik
2019-07-28 15:22   ` [tarantool-patches] " Vladislav Shpilevoy
2019-07-28 21:26     ` Konstantin Osipov
2019-07-31  9:14     ` n.pettik
2019-07-31 16:59       ` Vladislav Shpilevoy
2019-08-01 15:56         ` n.pettik
2019-08-01 23:11           ` Vladislav Shpilevoy
2019-08-02 10:02             ` n.pettik
2019-07-27 18:45 ` [tarantool-patches] [PATCH 2/2] sql: make default type of NULL be boolean Nikita Pettik
2019-08-02 10:52 ` [tarantool-patches] Re: [PATCH 0/2] Improve operability of typeof() function Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox