[Tarantool-patches] [PATCH 2/3] sql: CAST(<boolean> AS TEXT) returns lowercase

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Oct 29 00:48:08 MSK 2019


Hi! Thanks for the review!

On 28/10/2019 15:08, Nikita Pettik wrote:
> On 27 Oct 22:35, Vladislav Shpilevoy wrote:
>> Implicit cast uses lowercase, and the patch makes the
>> explicit cast the same.
>>
>> No any special reason behind that. Lower case is
>> already used much more often, so it is easier to drop
>> the upper case.
>>
>> Part of #4462
>> ---
> 
> Oh, unfortunately I've remembered the reason of using uppercase: it is
> said so by ANSI (2011):

Not unfortunately, it is good. Here is the incremental diff:

=======================================================================

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 12a4bee04..ace5d1612 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1411,8 +1411,9 @@ quoteFunc(sql_context * context, int argc, sql_value ** argv)
 			break;
 		}
 	case MP_BOOL: {
-		sql_result_text(context, sql_value_boolean(argv[0]) ?
-				"true" : "false", -1, SQL_TRANSIENT);
+		sql_result_text(context,
+				SQL_TOKEN_BOOLEAN(sql_value_boolean(argv[0])),
+				-1, SQL_TRANSIENT);
 		break;
 	}
 	default:{
diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
index c32cacc04..95eadbf87 100644
--- a/src/box/sql/sqlInt.h
+++ b/src/box/sql/sqlInt.h
@@ -306,6 +306,15 @@ struct sql_vfs {
 	*/
 };
 
+/**
+ * Canonical string representation of SQL BOOLEAN values.
+ * According to the standard it should be uppercase. See the 2011
+ * standard, cast specification 6.13, general rules 11.e.
+ */
+#define SQL_TOKEN_TRUE "TRUE"
+#define SQL_TOKEN_FALSE "FALSE"
+#define SQL_TOKEN_BOOLEAN(v) ({(v) ? SQL_TOKEN_TRUE : SQL_TOKEN_FALSE;})
+
 #define SQL_LIMIT_LENGTH                    0
 #define SQL_LIMIT_SQL_LENGTH                1
 #define SQL_LIMIT_COLUMN                    2
diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index f881a732e..03c63d847 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -536,7 +536,7 @@ memTracePrint(Mem *p)
 	} else if (p->flags & MEM_Real) {
 		printf(" r:%g", p->u.r);
 	} else if (p->flags & MEM_Bool) {
-		printf(" bool:%s", p->u.b ? "true" : "false");
+		printf(" bool:%s", SQL_TOKEN_BOOLEAN(p->u.b));
 	} else {
 		char zBuf[200];
 		sqlVdbeMemPrettyPrint(p, zBuf);
diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
index 6964c9097..2d37b62be 100644
--- a/src/box/sql/vdbemem.c
+++ b/src/box/sql/vdbemem.c
@@ -321,7 +321,8 @@ sqlVdbeMemStringify(Mem * pMem)
 		sql_snprintf(nByte, pMem->z, "%llu", pMem->u.u);
 		pMem->flags &= ~MEM_UInt;
 	} else if ((fg & MEM_Bool) != 0) {
-		sql_snprintf(nByte, pMem->z, "%s", pMem->u.b ? "true" : "false");
+		sql_snprintf(nByte, pMem->z, "%s",
+			     SQL_TOKEN_BOOLEAN(pMem->u.b));
 		pMem->flags &= ~MEM_Bool;
 	} else if (mem_has_msgpack_subtype(pMem)) {
 		sql_snprintf(nByte, pMem->z, "%s", value);
@@ -645,10 +646,11 @@ str_cast_to_boolean(const char *str, bool *result)
 {
 	assert(str != NULL);
 	for (; *str == ' '; str++);
-	if (strncasecmp(str, "true", 4) == 0) {
+	if (strncasecmp(str, SQL_TOKEN_TRUE, strlen(SQL_TOKEN_TRUE)) == 0) {
 		*result = true;
 		str += 4;
-	} else if (strncasecmp(str, "false", 5) == 0) {
+	} else if (strncasecmp(str, SQL_TOKEN_FALSE,
+			       strlen(SQL_TOKEN_FALSE)) == 0) {
 		*result = false;
 		str += 5;
 	} else {
@@ -753,7 +755,7 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type)
 		assert(type == FIELD_TYPE_STRING);
 		assert(MEM_Str == (MEM_Blob >> 3));
 		if ((pMem->flags & MEM_Bool) != 0) {
-			const char *str_bool = pMem->u.b ? "true" : "false";
+			const char *str_bool = SQL_TOKEN_BOOLEAN(pMem->u.b);
 			sqlVdbeMemSetStr(pMem, str_bool, strlen(str_bool), 1,
 					 SQL_TRANSIENT);
 			return 0;
diff --git a/test/sql/boolean.result b/test/sql/boolean.result
index 191fa33f3..e5a00696b 100644
--- a/test/sql/boolean.result
+++ b/test/sql/boolean.result
@@ -161,7 +161,7 @@ SELECT s FROM ts WHERE s IN (true, 1, 'abcd');
 INSERT INTO ts VALUES (true, 12345);
  | ---
  | - null
- | - 'Type mismatch: can not convert true to integer'
+ | - 'Type mismatch: can not convert TRUE to integer'
  | ...
 

  	... and tens of similar changes in the same file ...

diff --git a/test/sql/boolean.test.sql b/test/sql/boolean.test.sql
index 63eb505c5..65a830349 100644
--- a/test/sql/boolean.test.sql
+++ b/test/sql/boolean.test.sql
@@ -129,7 +129,7 @@ INSERT INTO t3 VALUES (4, false)
 -- Check CAST from BOOLEAN to the other types.
 SELECT cast(true AS INTEGER), cast(false AS INTEGER);
 SELECT cast(true AS NUMBER), cast(false AS NUMBER);
--- gh-4462: ensure that text representation is lowercase.
+-- gh-4462: ensure that text representation is uppercase.
 SELECT cast(true AS TEXT), cast(false AS TEXT);
 SELECT cast(true AS BOOLEAN), cast(false AS BOOLEAN);
 
diff --git a/test/sql/types.result b/test/sql/types.result
index 1bb8b0cec..37350f034 100644
--- a/test/sql/types.result
+++ b/test/sql/types.result
@@ -1232,7 +1232,7 @@ box.execute("INSERT INTO t VALUES(1, 1.123);")
 box.execute("INSERT INTO t VALUES(1, true);")
 ---
 - null
-- 'Type mismatch: can not convert true to varbinary'
+- 'Type mismatch: can not convert TRUE to varbinary'
 ...
 box.execute("INSERT INTO t VALUES(1, 'asd');")
 ---
@@ -1464,7 +1464,7 @@ box.execute("SELECT CAST(1.123 AS VARBINARY);")
 box.execute("SELECT CAST(true AS VARBINARY);")
 ---
 - null
-- 'Type mismatch: can not convert true to varbinary'
+- 'Type mismatch: can not convert TRUE to varbinary'
 ...
 box.execute("SELECT CAST('asd' AS VARBINARY);")
 ---

=======================================================================

New commit message:

    sql: implicit boolean cast to text returns uppercase
    
    Explicit cast uses uppercase, and the patch makes the
    implicit cast the same.
    
    Upper case is the standard according to SQL standard
    2011, cast specification 6.13, general rules 11.e:
    
        General rules
    
        11) If TD is variable-length character string or
            large object character string, then let MLTD
            be the maximum length in characters of TD.
    
            e) If SD (source type) is boolean, then Case:
    
                i)   If SV is True and MLTD is not less
                     than 4, then TV is 'TRUE'.
    
                ii)  If SV is False and MLTD is not less
                     than 5, then TV is 'FALSE'.
    
                iii) Otherwise, an exception condition is
                     raised: data exception — invalid
                     character value for cast.
    
    Part of #4462

> As for implicit cast - it is allowed neither in ANSI nor in Tarantool SQL.
> However, some functions (like length or group_concat) anyway provide implicit
> conversion from boolean to string. Why not simply disallow that?

We will disallow. But we have different tickets for that. Goal of this commit
and this ticket is to stabilize what we have at this moment until we drop
implicit casts. Not only boolean related, but all of them.

> Earlier I suggested to remove all dispatching logic from built-ins
> implementations and instead generate OP_ApplyType with function's argument
> and its assumed type. AfAIR Mergen faced some problems with that, but I
> believe it is likely to be most general and correct solution.
> 

Probably it is not a place to discuss it, but are you sure it is a good idea
to emit OP_ApplyType for each argument of each function before it is called?
This is one of the main reasons why we have implicit casts: now functions
internally transform types just like OP_ApplyType does. They transform
strings to numbers, and back. OP_ApplyType does the same, no? I was thinking
that on the contrary we are going to reduce usage of OP_ApplyType, as well
as drop manual casts from functions.


More information about the Tarantool-patches mailing list