Tarantool development patches archive
 help / color / mirror / Atom feed
From: "n.pettik" <korablev@tarantool.org>
To: tarantool-patches@freelists.org
Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH 3/8] sql: remove numeric affinity
Date: Wed, 16 Jan 2019 17:26:11 +0300	[thread overview]
Message-ID: <AAE8BC06-EFD7-467A-ADF1-7369837B277A@tarantool.org> (raw)
In-Reply-To: <28e6da44-2d1e-87b5-741d-a791750db6c4@tarantool.org>


>> Numeric affinity in SQLite means the same as real, except that it
>> forces integer values into floating point representation in case
>> it can be converted without loss (e.g. 2.0 -> 2).
>> Since in Tarantool core there is no difference between numeric and real
>> values (both are stored as values of type NUMBER), lets remove numeric
>> affinity and use instead real.
>> The only real pitfall is implicit conversion mentioned above.
>> What is more, vinyl engine complicates problem since it relies
>> on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
>> For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
>> be able to use iterators from Lua, since they implicitly change type of
>> 1.0 and pass it to the iterator as MP_INT.  Solution to this problem is
>> simple: lets always attempt at encoding floats as ints if conversion
>> takes place without loss. This is a straightforward approach, but to
>> implement it we need to care about reversed (decoding) situation.
> 
> The bug is confirmed: https://github.com/tarantool/tarantool/issues/3907
> 
> I agree with Kostja - it is just a bug, that Vinyl treats differently
> integers and their double casts. It should not affect design decisions
> of this patchset.

Ok, we may consider part of this patch as a workaround.
Now affinity removal would significantly help me to fix
other issues connected with strict typing (e.g. implicit casts).
afterwards code introduced in this commit may be simplified/removed.
I don’t know what priority of #3907 issue (milestone is 2.1.1 but
we know that sometimes it may take a while).

> But there is another reason why we can't pass *.0 as an iterator value -
> our fast comparators (TupleCompare, TupleCompareWithKey) are designed to
> work with only values of same MP_ type. They do not use slow
> tuple_compare_field() which is able to compare double and integer.

Yep, it is sad. I can’t say anything more now,I need to "think about it”.

The only workaround I can come up right now with is to add to
every OP_Found/OP_Seek etc opcode OP_PromoteType.
It would attempt at fetching space from cursor and gently
applying types from format to record to be passed to iterator.
But this doesn’t seem to be acceptable solution for many reasons.

Otherwise, I have no idea how to determine required conversions
*.0 -> * without execution context.

> Moreover, I think, we should forbid implicit *.0 -> * This is why we
> designed strict typing, isn't it?

It is too strict rule, ANSI allows this kind of conversion. 
The only restriction is that there shouldn’t be precision loss
during conversion. It has been already discussed with
Konstantin and Peter Gulutzan, I inline part of that thread
(Topic “[dev] Re: Casts")

> The standard rules for casting look like this
> (where SD = source data type, TD = target data type, SV = source value):
> "
> 8) If TD is exact numeric, then
> Case:
> a)If SD is exact numeric or approximate numeric, then
> Case:
>   i) If there is a representation of SV in the data type TD that does
>      not lose any leading significant digits after rounding or truncating
>      if necessary, then TV is that representation.
>      The choice of whether to round or truncate is implementation-defined.
>   ii)Otherwise, an exception condition is raised:
>      data exception — numeric value out of range.
> "
> So truncating 1.123 to 1 is okay, but truncating 11.123 to 1.123 is not okay
> (that loses a leading significant digit).
> 

Even more simple situation:

CREATE TABLE t1 (id INT PRIMARY KEY);
INSERT INTO t1 VALUES (1.0);

This example is valid and all DBs support it.

> If a table has column int, we should not be able to select it by double
> values without explicit cast, even if implicit is possible without loss.
> For explicit cast users have CAST operator.

It would look at least strange:

SELECT * FROM t1 WHERE x = (1.0 AS INT);

Again, all DBs I know accept this query without cast.

> 
>> OP_Column fetches from msgpack field with given number and stores it as
>> a native VDBE memory object. Type of that memory is based on type of
>> msgpack value. So, if space field is of type NUMBER and holds value 1,
>> type of VDBE memory will be INT (after decoding), not float 1.0.  As a
>> result, further calculations may be wrong: for instance, instead of
>> floating point division, we could get integer division.  To cope with
>> this problem, lets add auxiliary conversion to decoding routine which
>> uses space format of tuple to be decoded. It is worth mentioning that
>> ephemeral spaces don't feature space format, so we are going to rely on
>> type of key parts. Finally, internal VDBE merge sorter also operates on
>> entries encoded into msgpack. To fix this case, we check type of
>> ORDER BY/GROUP BY arguments: if they are of type float, we are emitting
>> additional opcode OP_AffinityReal to force float type after encoding.
>> Part of #3698
>> ---
>>  src/box/field_def.h                  |  1 -
>>  src/box/lua/lua_sql.c                |  2 +-
>>  src/box/sql.c                        | 10 +++++++---
>>  src/box/sql/build.c                  |  1 -
>>  src/box/sql/expr.c                   | 20 +++++++++++---------
>>  src/box/sql/select.c                 | 10 +++++++---
>>  src/box/sql/sqliteInt.h              |  2 +-
>>  src/box/sql/vdbe.c                   | 26 ++++++++++++++++++--------
>>  src/box/sql/vdbeaux.c                | 19 ++++++++++++++++---
>>  src/box/sql/vdbemem.c                |  7 ++-----
>>  test/sql-tap/tkt-80e031a00f.test.lua | 12 ++++++------
>>  11 files changed, 69 insertions(+), 41 deletions(-)
>> diff --git a/src/box/sql.c b/src/box/sql.c
>> index a06c50dca..a498cd8fe 100644
>> --- a/src/box/sql.c
>> +++ b/src/box/sql.c
>> @@ -376,14 +376,18 @@ sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
>>  	for (uint32_t i = 0; i < field_count; ++i) {
>>  		struct key_part_def *part = &ephemer_key_parts[i];
>>  		part->fieldno = i;
>> -		part->type = FIELD_TYPE_SCALAR;
>>  		part->nullable_action = ON_CONFLICT_ACTION_NONE;
>>  		part->is_nullable = true;
>>  		part->sort_order = SORT_ORDER_ASC;
>> -		if (def != NULL && i < def->part_count)
>> +		if (def != NULL && i < def->part_count) {
>> +			assert(def->parts[i].type < field_type_MAX);
>> +			part->type = def->parts[i].type != FIELD_TYPE_ANY ?
>> +				     def->parts[i].type : FIELD_TYPE_SCALAR;
>>  			part->coll_id = def->parts[i].coll_id;
> 
> 1. How can key_part have FIELD_TYPE_ANY? We have no comparators for ANY
> type, it is impossible, isn't it?

We don’t, and that is why I replace ANY with SCALAR.

> 
>> -		else
>> +		} else {
>>  			part->coll_id = COLL_NONE;
>> +			part->type = FIELD_TYPE_SCALAR;
>> +		}
>>  	}
>>  	struct key_def *ephemer_key_def = key_def_new(ephemer_key_parts,
>>  						      field_count);
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index b67b22c23..7a0b929a7 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -3700,6 +3698,10 @@ sqlite3ExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
>>  				sqlite3VdbeAddOp3(v, OP_Column,
>>  						  pAggInfo->sortingIdxPTab,
>>  						  pCol->iSorterColumn, target);
>> +				if (pCol->space_def->fields[pExpr->iAgg].type ==
>> +				    FIELD_TYPE_NUMBER)
>> +					sqlite3VdbeAddOp1(v, OP_RealAffinity,
>> +							  target);
> 
> 2. Please, use {} for multi-line 'if' body.

Ok (I always confuse multi-line with multi-statement body):

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 7a0b929a7..14b727039 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -3699,9 +3699,10 @@ sqlite3ExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
                                                  pAggInfo->sortingIdxPTab,
                                                  pCol->iSorterColumn, target);
                                if (pCol->space_def->fields[pExpr->iAgg].type ==
-                                   FIELD_TYPE_NUMBER)
+                                   FIELD_TYPE_NUMBER) {
                                        sqlite3VdbeAddOp1(v, OP_RealAffinity,
                                                          target);
+                               }
                                return target;
                        }


> 
>>  				return target;
>>  			}
>>  			/* Otherwise, fall thru into the TK_COLUMN case */
>> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
>> index 02ee225f1..40336e679 100644
>> --- a/src/box/sql/select.c
>> +++ b/src/box/sql/select.c
>> @@ -1427,6 +1427,8 @@ sql_expr_list_to_key_info(struct Parse *parse, struct ExprList *list, int start)
>>  		sql_expr_coll(parse, item->pExpr, &unused, &id);
>>  		part->coll_id = id;
>>  		part->sort_order = item->sort_order;
>> +		enum affinity_type aff = sqlite3ExprAffinity(item->pExpr);
>> +		part->type =sql_affinity_to_field_type(aff);
> 
> 3. Lost white-space after '=‘.

Fixed:

diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index 40336e679..fa0fea6c8 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -1428,7 +1428,7 @@ sql_expr_list_to_key_info(struct Parse *parse, struct ExprList *list, int start)
                part->coll_id = id;
                part->sort_order = item->sort_order;
                enum affinity_type aff = sqlite3ExprAffinity(item->pExpr);
-               part->type =sql_affinity_to_field_type(aff);
+               part->type = sql_affinity_to_field_type(aff);

> 
>>  	}
>>  	return key_info;
>>  }
>> @@ -5789,8 +5790,11 @@ sqlite3Select(Parse * pParse,		/* The parser context */
>>  	/* If the output is destined for a temporary table, open that table.
>>  	 */
>>  	if (pDest->eDest == SRT_EphemTab) {
>> -		sqlite3VdbeAddOp2(v, OP_OpenTEphemeral, pDest->reg_eph,
>> -				  pEList->nExpr + 1);
>> +		struct sql_key_info *key_info =
>> +			sql_expr_list_to_key_info(pParse, pEList, 0);
>> +		sqlite3VdbeAddOp4(v, OP_OpenTEphemeral, pDest->reg_eph,
>> +				  pEList->nExpr + 1, 0, (char *)key_info,
>> +				  P4_KEYINFO);
>>  		sqlite3VdbeAddOp3(v, OP_IteratorOpen, pDest->iSDParm, 0,
>>  				  pDest->reg_eph);
> 
> 4. Why can some ephemeral spaces be created without key_info? I grepped
> by OP_OpenTEphemeral and saw that somewhere it is still Op2, not Op4.

If there is no key_info, then default one is used instead.
See sql_ephemeral_space_create and how we cope
with situation when key_info == NULL.

> 
>>  diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
>> index e6b413c70..4345af24e 100644
>> --- a/src/box/sql/vdbe.c
>> +++ b/src/box/sql/vdbe.c
>> @@ -2173,7 +2167,7 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
>>  	} else {
>>  		/* Neither operand is NULL.  Do a comparison. */
>>  		affinity = pOp->p5 & AFFINITY_MASK;
>> -		if (affinity>=AFFINITY_NUMERIC) {
>> +		if (affinity>=AFFINITY_INTEGER) {
> 
> 5. I guess, we have sqlite3IsNumericAffinity for this.

Anyway I replace it with sql_affinity_is_numeric in
sql: clean-up affinity from SQL source code patch.

  parent reply	other threads:[~2019-01-16 14:26 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-28  9:34 [tarantool-patches] [PATCH 0/8] Eliminate affinity from source code Nikita Pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 1/8] sql: remove SQLITE_ENABLE_UPDATE_DELETE_LIMIT define Nikita Pettik
2018-12-29 17:42   ` [tarantool-patches] " Vladislav Shpilevoy
2019-01-16 14:25     ` n.pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 2/8] sql: use field type instead of affinity for type_def Nikita Pettik
2018-12-29 17:42   ` [tarantool-patches] " Vladislav Shpilevoy
2019-01-16 14:26     ` n.pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 3/8] sql: remove numeric affinity Nikita Pettik
2018-12-29  9:01   ` [tarantool-patches] " Konstantin Osipov
2018-12-29 17:42   ` Vladislav Shpilevoy
2019-01-09  8:26     ` Konstantin Osipov
2019-01-16 14:26     ` n.pettik [this message]
2019-01-22 15:41       ` Vladislav Shpilevoy
2019-01-28 16:39         ` n.pettik
2019-01-30 13:04           ` Vladislav Shpilevoy
2019-02-01 16:39             ` n.pettik
2019-01-09  8:20   ` Konstantin Osipov
2018-12-28  9:34 ` [tarantool-patches] [PATCH 4/8] sql: replace affinity with field type for func Nikita Pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 5/8] sql: replace field type with affinity for VDBE runtime Nikita Pettik
2018-12-29 17:42   ` [tarantool-patches] " Vladislav Shpilevoy
2019-01-16 14:26     ` n.pettik
2019-01-22 15:41       ` Vladislav Shpilevoy
2019-01-28 16:39         ` n.pettik
2019-01-30 13:04           ` Vladislav Shpilevoy
2019-02-01 16:39             ` n.pettik
2019-02-05 15:08               ` Vladislav Shpilevoy
2019-02-05 17:46                 ` n.pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 6/8] sql: replace affinity with field type in struct Expr Nikita Pettik
2018-12-29 17:42   ` [tarantool-patches] " Vladislav Shpilevoy
2019-01-16 14:26     ` n.pettik
2019-01-22 15:41       ` Vladislav Shpilevoy
2019-01-28 16:39         ` n.pettik
2019-01-30 13:04           ` Vladislav Shpilevoy
2019-02-01 16:39             ` n.pettik
2019-02-05 15:08               ` Vladislav Shpilevoy
2019-02-05 17:46                 ` n.pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 7/8] sql: clean-up affinity from SQL source code Nikita Pettik
2018-12-29 17:42   ` [tarantool-patches] " Vladislav Shpilevoy
2019-01-16 14:26     ` n.pettik
2019-01-22 15:41       ` Vladislav Shpilevoy
2019-01-28 16:40         ` n.pettik
2019-01-30 13:04           ` Vladislav Shpilevoy
2019-02-01 16:39             ` n.pettik
2019-02-05 15:08               ` Vladislav Shpilevoy
2019-02-05 17:46                 ` n.pettik
2018-12-28  9:34 ` [tarantool-patches] [PATCH 8/8] Remove affinity from field definition Nikita Pettik
2019-02-05 19:41 ` [tarantool-patches] Re: [PATCH 0/8] Eliminate affinity from source code Vladislav Shpilevoy
2019-02-08 13:37 ` Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AAE8BC06-EFD7-467A-ADF1-7369837B277A@tarantool.org \
    --to=korablev@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH 3/8] sql: remove numeric affinity' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox