From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <v.shpilevoy@tarantool.org>
Received: from smtp38.i.mail.ru (smtp38.i.mail.ru [94.100.177.98])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id 5CB27452566
 for <tarantool-patches@dev.tarantool.org>;
 Sat,  2 Nov 2019 20:44:19 +0300 (MSK)
References: <7361e30e0857bf3c4d05b72713bbdf2596a1452f.1572693924.git.imeevma@gmail.com>
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Message-ID: <0a281b27-407a-9ab6-6d75-1564142a9bf9@tarantool.org>
Date: Sat, 2 Nov 2019 18:49:58 +0100
MIME-Version: 1.0
In-Reply-To: <7361e30e0857bf3c4d05b72713bbdf2596a1452f.1572693924.git.imeevma@gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH v1 1/1] sql: make NUMBER to be union
 of SQL numeric types
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: imeevma@tarantool.org
Cc: tarantool-patches@freelists.org, tarantool-patches@dev.tarantool.org

Thanks for the patch!

See 5 comments below.

On 02/11/2019 12:30, imeevma@tarantool.org wrote:
> This patch makes number to be union of UNSIGNED, INTEGER and
> DOUBLE numeric types. The first two types represented by Tarantool
> UNSIGNED and INTEGER types. Currently there is not DOUBLE type in
> Tarantool.
> 
> Closes #4233
> Closes #4463
> ---
> https://github.com/tarantool/tarantool/branches

https://github.com/tarantool/tarantool/tree/imeevma/gh-4233-fix-number-field-type-in-sql

> https://github.com/tarantool/tarantool/issues/4233
> https://github.com/tarantool/tarantool/issues/4463
> 
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index 03c63d8..61e6487 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -1660,6 +1657,7 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
>  				 sql_value_to_diag_str(pIn2), "numeric");
>  			goto abort_due_to_error;
>  		}
> +		assert(((type1 | type2) & MEM_Real) !=0);

1. Please, put a whitespace after '!='.

>  		switch( pOp->opcode) {
>  		case OP_Add:         rB += rA;       break;
>  		case OP_Subtract:    rB -= rA;       break;
> diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
> index 2d37b62..6bfd4c0 100644
> --- a/src/box/sql/vdbemem.c
> +++ b/src/box/sql/vdbemem.c
> @@ -596,6 +596,34 @@ sqlVdbeMemRealify(Mem * pMem)
> +static int
> +sql_mem_to_number(struct Mem *mem)
> +{
> +	assert((mem->flags & (MEM_Int | MEM_UInt | MEM_Real | MEM_Null)) == 0);
> +	if ((mem->flags & (MEM_Blob | MEM_Str)) == 0)
> +		return -1;
> +	assert(mem->z[mem->n] == '\0');
> +	char *tail = NULL;
> +	long long unsigned int value_u = strtoull(mem->z, &tail, 10);
> +	if (tail[0] == '\0' && value_u <= UINT64_MAX) {

2. I think you can safely assume, that long long unsigned == uint64_t.
As well as long long int == int64_t.

> +		mem->u.u = value_u;
> +		MemSetTypeFlag(mem, MEM_UInt);
> +		return 0;
> +	}
> +	long long int value_i = strtoll(mem->z, &tail, 10);
> +	if (tail[0] == '\0' && value_i <= INT64_MAX && value_i >= INT64_MIN) {
> +		mem->u.i = value_i;
> +		MemSetTypeFlag(mem, MEM_Int);
> +		return 0;
> +	}

3. strtoull and strtoll are very mean functions. It is easy to
screw their results:

tarantool> box.execute("SELECT CAST('  9223372036854774800 ' AS NUMBER);")
---
- metadata:
  - name: CAST('  9223372036854774800 ' AS NUMBER)
    type: number
  rows:
  - [9223372036854774784]
...

tarantool> box.execute("SELECT CAST('  9223372036854774800' AS NUMBER);")
---
- metadata:
  - name: CAST('  9223372036854774800' AS NUMBER)
    type: number
  rows:
  - [9223372036854774800]
...

Here presence of whitespaces in the end of a number string affects the
cast result. So you can't actually treat tail[0] == 0 as a sign of
success.

One another example:

tarantool> box.execute("SELECT CAST('' AS NUMBER);")
---
- metadata:
  - name: CAST('' AS NUMBER)
    type: number
  rows:
  - [0]
...

You can find how we work with them in expr_code_int(), sql_atoi64(),
lbox_tonumber64().

> +	double value_d;
> +	if (sqlAtoF(mem->z, &value_d, mem->n) == 0)
> +		return -1;

4. Why not 'strtod'?

> +	mem->u.r = value_d;
> +	MemSetTypeFlag(mem, MEM_Real);
> +	return 0;
> +}
> +
> diff --git a/test/sql-tap/numcast.test.lua b/test/sql-tap/numcast.test.lua
> index 9f72485..1355bad 100755
> --- a/test/sql-tap/numcast.test.lua
> +++ b/test/sql-tap/numcast.test.lua
> @@ -65,5 +65,107 @@ for _, enc in ipairs({"utf8"}) do
> +test:do_execsql_test(
> +    "numcast-2.7",
> +    [[
> +        SELECT CAST('101     ' AS NUMBER) / 10, CAST('      101' AS NUMBER) / 10;
> +    ]], {
> +        10.1, 10

5. So this is actually incorrect. Why a number of whitespaces in the end
affects the result type?