[tarantool-patches] Re: [PATCH 3/8] sql: remove numeric affinity

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sat Dec 29 20:42:22 MSK 2018


Thanks for the patch! See 5 comments below.

On 28/12/2018 12:34, Nikita Pettik wrote:
> Numeric affinity in SQLite means the same as real, except that it
> forces integer values into floating point representation in case
> it can be converted without loss (e.g. 2.0 -> 2).
> Since in Tarantool core there is no difference between numeric and real
> values (both are stored as values of type NUMBER), lets remove numeric
> affinity and use instead real.
> 
> The only real pitfall is implicit conversion mentioned above.
> What is more, vinyl engine complicates problem since it relies
> on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
> For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
> be able to use iterators from Lua, since they implicitly change type of
> 1.0 and pass it to the iterator as MP_INT.  Solution to this problem is
> simple: lets always attempt at encoding floats as ints if conversion
> takes place without loss. This is a straightforward approach, but to
> implement it we need to care about reversed (decoding) situation.

The bug is confirmed: https://github.com/tarantool/tarantool/issues/3907

I agree with Kostja - it is just a bug, that Vinyl treats differently
integers and their double casts. It should not affect design decisions
of this patchset.

But there is another reason why we can't pass *.0 as an iterator value -
our fast comparators (TupleCompare, TupleCompareWithKey) are designed to
work with only values of same MP_ type. They do not use slow
tuple_compare_field() which is able to compare double and integer.

Moreover, I think, we should forbid implicit *.0 -> * This is why we
designed strict typing, isn't it?

If a table has column int, we should not be able to select it by double
values without explicit cast, even if implicit is possible without loss.
For explicit cast users have CAST operator.

What do you think?

> 
> OP_Column fetches from msgpack field with given number and stores it as
> a native VDBE memory object. Type of that memory is based on type of
> msgpack value. So, if space field is of type NUMBER and holds value 1,
> type of VDBE memory will be INT (after decoding), not float 1.0.  As a
> result, further calculations may be wrong: for instance, instead of
> floating point division, we could get integer division.  To cope with
> this problem, lets add auxiliary conversion to decoding routine which
> uses space format of tuple to be decoded. It is worth mentioning that
> ephemeral spaces don't feature space format, so we are going to rely on
> type of key parts. Finally, internal VDBE merge sorter also operates on
> entries encoded into msgpack. To fix this case, we check type of
> ORDER BY/GROUP BY arguments: if they are of type float, we are emitting
> additional opcode OP_AffinityReal to force float type after encoding.
> 
> Part of #3698
> ---
>   src/box/field_def.h                  |  1 -
>   src/box/lua/lua_sql.c                |  2 +-
>   src/box/sql.c                        | 10 +++++++---
>   src/box/sql/build.c                  |  1 -
>   src/box/sql/expr.c                   | 20 +++++++++++---------
>   src/box/sql/select.c                 | 10 +++++++---
>   src/box/sql/sqliteInt.h              |  2 +-
>   src/box/sql/vdbe.c                   | 26 ++++++++++++++++++--------
>   src/box/sql/vdbeaux.c                | 19 ++++++++++++++++---
>   src/box/sql/vdbemem.c                |  7 ++-----
>   test/sql-tap/tkt-80e031a00f.test.lua | 12 ++++++------
>   11 files changed, 69 insertions(+), 41 deletions(-)
> 
> diff --git a/src/box/sql.c b/src/box/sql.c
> index a06c50dca..a498cd8fe 100644
> --- a/src/box/sql.c
> +++ b/src/box/sql.c
> @@ -376,14 +376,18 @@ sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
>   	for (uint32_t i = 0; i < field_count; ++i) {
>   		struct key_part_def *part = &ephemer_key_parts[i];
>   		part->fieldno = i;
> -		part->type = FIELD_TYPE_SCALAR;
>   		part->nullable_action = ON_CONFLICT_ACTION_NONE;
>   		part->is_nullable = true;
>   		part->sort_order = SORT_ORDER_ASC;
> -		if (def != NULL && i < def->part_count)
> +		if (def != NULL && i < def->part_count) {
> +			assert(def->parts[i].type < field_type_MAX);
> +			part->type = def->parts[i].type != FIELD_TYPE_ANY ?
> +				     def->parts[i].type : FIELD_TYPE_SCALAR;
>   			part->coll_id = def->parts[i].coll_id;

1. How can key_part have FIELD_TYPE_ANY? We have no comparators for ANY
type, it is impossible, isn't it?

> -		else
> +		} else {
>   			part->coll_id = COLL_NONE;
> +			part->type = FIELD_TYPE_SCALAR;
> +		}
>   	}
>   	struct key_def *ephemer_key_def = key_def_new(ephemer_key_parts,
>   						      field_count);
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index b67b22c23..7a0b929a7 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -3700,6 +3698,10 @@ sqlite3ExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
>   				sqlite3VdbeAddOp3(v, OP_Column,
>   						  pAggInfo->sortingIdxPTab,
>   						  pCol->iSorterColumn, target);
> +				if (pCol->space_def->fields[pExpr->iAgg].type ==
> +				    FIELD_TYPE_NUMBER)
> +					sqlite3VdbeAddOp1(v, OP_RealAffinity,
> +							  target);

2. Please, use {} for multi-line 'if' body.

>   				return target;
>   			}
>   			/* Otherwise, fall thru into the TK_COLUMN case */
> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index 02ee225f1..40336e679 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -1427,6 +1427,8 @@ sql_expr_list_to_key_info(struct Parse *parse, struct ExprList *list, int start)
>   		sql_expr_coll(parse, item->pExpr, &unused, &id);
>   		part->coll_id = id;
>   		part->sort_order = item->sort_order;
> +		enum affinity_type aff = sqlite3ExprAffinity(item->pExpr);
> +		part->type =sql_affinity_to_field_type(aff);

3. Lost white-space after '='.

>   	}
>   	return key_info;
>   }
> @@ -5789,8 +5790,11 @@ sqlite3Select(Parse * pParse,		/* The parser context */
>   	/* If the output is destined for a temporary table, open that table.
>   	 */
>   	if (pDest->eDest == SRT_EphemTab) {
> -		sqlite3VdbeAddOp2(v, OP_OpenTEphemeral, pDest->reg_eph,
> -				  pEList->nExpr + 1);
> +		struct sql_key_info *key_info =
> +			sql_expr_list_to_key_info(pParse, pEList, 0);
> +		sqlite3VdbeAddOp4(v, OP_OpenTEphemeral, pDest->reg_eph,
> +				  pEList->nExpr + 1, 0, (char *)key_info,
> +				  P4_KEYINFO);
>   		sqlite3VdbeAddOp3(v, OP_IteratorOpen, pDest->iSDParm, 0,
>   				  pDest->reg_eph);

4. Why can some ephemeral spaces be created without key_info? I grepped
by OP_OpenTEphemeral and saw that somewhere it is still Op2, not Op4.

>   
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index e6b413c70..4345af24e 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -2173,7 +2167,7 @@ case OP_Ge: {             /* same as TK_GE, jump, in1, in3 */
>   	} else {
>   		/* Neither operand is NULL.  Do a comparison. */
>   		affinity = pOp->p5 & AFFINITY_MASK;
> -		if (affinity>=AFFINITY_NUMERIC) {
> +		if (affinity>=AFFINITY_INTEGER) {

5. I guess, we have sqlite3IsNumericAffinity for this.

>   			if ((flags1 | flags3)&MEM_Str) {
>   				if ((flags1 & (MEM_Int|MEM_Real|MEM_Str))==MEM_Str) {
>   					applyNumericAffinity(pIn1,0);




More information about the Tarantool-patches mailing list