From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounce@freelists.org>
Received: from localhost (localhost [127.0.0.1])
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id B86CB2576F
	for <tarantool-patches@freelists.org>; Wed, 16 Jan 2019 09:26:13 -0500 (EST)
Received: from turing.freelists.org ([127.0.0.1])
	by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id sUR9kuKkOPW0 for <tarantool-patches@freelists.org>;
	Wed, 16 Jan 2019 09:26:13 -0500 (EST)
Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 47A172569E
	for <tarantool-patches@freelists.org>; Wed, 16 Jan 2019 09:26:13 -0500 (EST)
Content-Type: text/plain;
	charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 12.0 \(3445.100.39\))
Subject: [tarantool-patches] Re: [PATCH 3/8] sql: remove numeric affinity
From: "n.pettik" <korablev@tarantool.org>
In-Reply-To: <28e6da44-2d1e-87b5-741d-a791750db6c4@tarantool.org>
Date: Wed, 16 Jan 2019 17:26:11 +0300
Content-Transfer-Encoding: quoted-printable
Message-Id: <AAE8BC06-EFD7-467A-ADF1-7369837B277A@tarantool.org>
References: <cover.1545987214.git.korablev@tarantool.org>
 <58b4d75729413f02134c72886ecbc749e510a1d1.1545987214.git.korablev@tarantool.org>
 <28e6da44-2d1e-87b5-741d-a791750db6c4@tarantool.org>
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help: <mailto:ecartis@freelists.org?Subject=help>
List-unsubscribe: <tarantool-patches-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches <tarantool-patches.freelists.org>
List-subscribe: <tarantool-patches-request@freelists.org?Subject=subscribe>
List-owner: <mailto:>
List-post: <mailto:tarantool-patches@freelists.org>
List-archive: <http://www.freelists.org/archives/tarantool-patches>
To: tarantool-patches@freelists.org
Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>


>> Numeric affinity in SQLite means the same as real, except that it
>> forces integer values into floating point representation in case
>> it can be converted without loss (e.g. 2.0 -> 2).
>> Since in Tarantool core there is no difference between numeric and =
real
>> values (both are stored as values of type NUMBER), lets remove =
numeric
>> affinity and use instead real.
>> The only real pitfall is implicit conversion mentioned above.
>> What is more, vinyl engine complicates problem since it relies
>> on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
>> For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
>> be able to use iterators from Lua, since they implicitly change type =
of
>> 1.0 and pass it to the iterator as MP_INT.  Solution to this problem =
is
>> simple: lets always attempt at encoding floats as ints if conversion
>> takes place without loss. This is a straightforward approach, but to
>> implement it we need to care about reversed (decoding) situation.
>=20
> The bug is confirmed: =
https://github.com/tarantool/tarantool/issues/3907
>=20
> I agree with Kostja - it is just a bug, that Vinyl treats differently
> integers and their double casts. It should not affect design decisions
> of this patchset.

Ok, we may consider part of this patch as a workaround.
Now affinity removal would significantly help me to fix
other issues connected with strict typing (e.g. implicit casts).
afterwards code introduced in this commit may be simplified/removed.
I don=E2=80=99t know what priority of #3907 issue (milestone is 2.1.1 =
but
we know that sometimes it may take a while).

> But there is another reason why we can't pass *.0 as an iterator value =
-
> our fast comparators (TupleCompare, TupleCompareWithKey) are designed =
to
> work with only values of same MP_ type. They do not use slow
> tuple_compare_field() which is able to compare double and integer.

Yep, it is sad. I can=E2=80=99t say anything more now,I need to "think =
about it=E2=80=9D.

The only workaround I can come up right now with is to add to
every OP_Found/OP_Seek etc opcode OP_PromoteType.
It would attempt at fetching space from cursor and gently
applying types from format to record to be passed to iterator.
But this doesn=E2=80=99t seem to be acceptable solution for many =
reasons.

Otherwise, I have no idea how to determine required conversions
*.0 -> * without execution context.

> Moreover, I think, we should forbid implicit *.0 -> * This is why we
> designed strict typing, isn't it?

It is too strict rule, ANSI allows this kind of conversion.=20
The only restriction is that there shouldn=E2=80=99t be precision loss
during conversion. It has been already discussed with
Konstantin and Peter Gulutzan, I inline part of that thread
(Topic =E2=80=9C[dev] Re: Casts")

> The standard rules for casting look like this
> (where SD =3D source data type, TD =3D target data type, SV =3D source =
value):
> "
> 8) If TD is exact numeric, then
> Case:
> a)If SD is exact numeric or approximate numeric, then
> Case:
>   i) If there is a representation of SV in the data type TD that does
>      not lose any leading significant digits after rounding or =
truncating
>      if necessary, then TV is that representation.
>      The choice of whether to round or truncate is =
implementation-defined.
>   ii)Otherwise, an exception condition is raised:
>      data exception =E2=80=94 numeric value out of range.
> "
> So truncating 1.123 to 1 is okay, but truncating 11.123 to 1.123 is =
not okay
> (that loses a leading significant digit).
>=20

Even more simple situation:

CREATE TABLE t1 (id INT PRIMARY KEY);
INSERT INTO t1 VALUES (1.0);

This example is valid and all DBs support it.

> If a table has column int, we should not be able to select it by =
double
> values without explicit cast, even if implicit is possible without =
loss.
> For explicit cast users have CAST operator.

It would look at least strange:

SELECT * FROM t1 WHERE x =3D (1.0 AS INT);

Again, all DBs I know accept this query without cast.

>=20
>> OP_Column fetches from msgpack field with given number and stores it =
as
>> a native VDBE memory object. Type of that memory is based on type of
>> msgpack value. So, if space field is of type NUMBER and holds value =
1,
>> type of VDBE memory will be INT (after decoding), not float 1.0.  As =
a
>> result, further calculations may be wrong: for instance, instead of
>> floating point division, we could get integer division.  To cope with
>> this problem, lets add auxiliary conversion to decoding routine which
>> uses space format of tuple to be decoded. It is worth mentioning that
>> ephemeral spaces don't feature space format, so we are going to rely =
on
>> type of key parts. Finally, internal VDBE merge sorter also operates =
on
>> entries encoded into msgpack. To fix this case, we check type of
>> ORDER BY/GROUP BY arguments: if they are of type float, we are =
emitting
>> additional opcode OP_AffinityReal to force float type after encoding.
>> Part of #3698
>> ---
>>  src/box/field_def.h                  |  1 -
>>  src/box/lua/lua_sql.c                |  2 +-
>>  src/box/sql.c                        | 10 +++++++---
>>  src/box/sql/build.c                  |  1 -
>>  src/box/sql/expr.c                   | 20 +++++++++++---------
>>  src/box/sql/select.c                 | 10 +++++++---
>>  src/box/sql/sqliteInt.h              |  2 +-
>>  src/box/sql/vdbe.c                   | 26 ++++++++++++++++++--------
>>  src/box/sql/vdbeaux.c                | 19 ++++++++++++++++---
>>  src/box/sql/vdbemem.c                |  7 ++-----
>>  test/sql-tap/tkt-80e031a00f.test.lua | 12 ++++++------
>>  11 files changed, 69 insertions(+), 41 deletions(-)
>> diff --git a/src/box/sql.c b/src/box/sql.c
>> index a06c50dca..a498cd8fe 100644
>> --- a/src/box/sql.c
>> +++ b/src/box/sql.c
>> @@ -376,14 +376,18 @@ sql_ephemeral_space_create(uint32_t =
field_count, struct sql_key_info *key_info)
>>  	for (uint32_t i =3D 0; i < field_count; ++i) {
>>  		struct key_part_def *part =3D &ephemer_key_parts[i];
>>  		part->fieldno =3D i;
>> -		part->type =3D FIELD_TYPE_SCALAR;
>>  		part->nullable_action =3D ON_CONFLICT_ACTION_NONE;
>>  		part->is_nullable =3D true;
>>  		part->sort_order =3D SORT_ORDER_ASC;
>> -		if (def !=3D NULL && i < def->part_count)
>> +		if (def !=3D NULL && i < def->part_count) {
>> +			assert(def->parts[i].type < field_type_MAX);
>> +			part->type =3D def->parts[i].type !=3D =
FIELD_TYPE_ANY ?
>> +				     def->parts[i].type : =
FIELD_TYPE_SCALAR;
>>  			part->coll_id =3D def->parts[i].coll_id;
>=20
> 1. How can key_part have FIELD_TYPE_ANY? We have no comparators for =
ANY
> type, it is impossible, isn't it?

We don=E2=80=99t, and that is why I replace ANY with SCALAR.

>=20
>> -		else
>> +		} else {
>>  			part->coll_id =3D COLL_NONE;
>> +			part->type =3D FIELD_TYPE_SCALAR;
>> +		}
>>  	}
>>  	struct key_def *ephemer_key_def =3D =
key_def_new(ephemer_key_parts,
>>  						      field_count);
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index b67b22c23..7a0b929a7 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -3700,6 +3698,10 @@ sqlite3ExprCodeTarget(Parse * pParse, Expr * =
pExpr, int target)
>>  				sqlite3VdbeAddOp3(v, OP_Column,
>>  						  =
pAggInfo->sortingIdxPTab,
>>  						  pCol->iSorterColumn, =
target);
>> +				if =
(pCol->space_def->fields[pExpr->iAgg].type =3D=3D
>> +				    FIELD_TYPE_NUMBER)
>> +					sqlite3VdbeAddOp1(v, =
OP_RealAffinity,
>> +							  target);
>=20
> 2. Please, use {} for multi-line 'if' body.

Ok (I always confuse multi-line with multi-statement body):

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 7a0b929a7..14b727039 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -3699,9 +3699,10 @@ sqlite3ExprCodeTarget(Parse * pParse, Expr * =
pExpr, int target)
                                                  =
pAggInfo->sortingIdxPTab,
                                                  pCol->iSorterColumn, =
target);
                                if =
(pCol->space_def->fields[pExpr->iAgg].type =3D=3D
-                                   FIELD_TYPE_NUMBER)
+                                   FIELD_TYPE_NUMBER) {
                                        sqlite3VdbeAddOp1(v, =
OP_RealAffinity,
                                                          target);
+                               }
                                return target;
                        }


>=20
>>  				return target;
>>  			}
>>  			/* Otherwise, fall thru into the TK_COLUMN case =
*/
>> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
>> index 02ee225f1..40336e679 100644
>> --- a/src/box/sql/select.c
>> +++ b/src/box/sql/select.c
>> @@ -1427,6 +1427,8 @@ sql_expr_list_to_key_info(struct Parse *parse, =
struct ExprList *list, int start)
>>  		sql_expr_coll(parse, item->pExpr, &unused, &id);
>>  		part->coll_id =3D id;
>>  		part->sort_order =3D item->sort_order;
>> +		enum affinity_type aff =3D =
sqlite3ExprAffinity(item->pExpr);
>> +		part->type =3Dsql_affinity_to_field_type(aff);
>=20
> 3. Lost white-space after '=3D=E2=80=98.

Fixed:

diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index 40336e679..fa0fea6c8 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -1428,7 +1428,7 @@ sql_expr_list_to_key_info(struct Parse *parse, =
struct ExprList *list, int start)
                part->coll_id =3D id;
                part->sort_order =3D item->sort_order;
                enum affinity_type aff =3D =
sqlite3ExprAffinity(item->pExpr);
-               part->type =3Dsql_affinity_to_field_type(aff);
+               part->type =3D sql_affinity_to_field_type(aff);

>=20
>>  	}
>>  	return key_info;
>>  }
>> @@ -5789,8 +5790,11 @@ sqlite3Select(Parse * pParse,		/* The =
parser context */
>>  	/* If the output is destined for a temporary table, open that =
table.
>>  	 */
>>  	if (pDest->eDest =3D=3D SRT_EphemTab) {
>> -		sqlite3VdbeAddOp2(v, OP_OpenTEphemeral, pDest->reg_eph,
>> -				  pEList->nExpr + 1);
>> +		struct sql_key_info *key_info =3D
>> +			sql_expr_list_to_key_info(pParse, pEList, 0);
>> +		sqlite3VdbeAddOp4(v, OP_OpenTEphemeral, pDest->reg_eph,
>> +				  pEList->nExpr + 1, 0, (char =
*)key_info,
>> +				  P4_KEYINFO);
>>  		sqlite3VdbeAddOp3(v, OP_IteratorOpen, pDest->iSDParm, 0,
>>  				  pDest->reg_eph);
>=20
> 4. Why can some ephemeral spaces be created without key_info? I =
grepped
> by OP_OpenTEphemeral and saw that somewhere it is still Op2, not Op4.

If there is no key_info, then default one is used instead.
See sql_ephemeral_space_create and how we cope
with situation when key_info =3D=3D NULL.

>=20
>>  diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
>> index e6b413c70..4345af24e 100644
>> --- a/src/box/sql/vdbe.c
>> +++ b/src/box/sql/vdbe.c
>> @@ -2173,7 +2167,7 @@ case OP_Ge: {             /* same as TK_GE, =
jump, in1, in3 */
>>  	} else {
>>  		/* Neither operand is NULL.  Do a comparison. */
>>  		affinity =3D pOp->p5 & AFFINITY_MASK;
>> -		if (affinity>=3DAFFINITY_NUMERIC) {
>> +		if (affinity>=3DAFFINITY_INTEGER) {
>=20
> 5. I guess, we have sqlite3IsNumericAffinity for this.

Anyway I replace it with sql_affinity_is_numeric in
sql: clean-up affinity from SQL source code patch.