[tarantool-patches] Re: [PATCH 3/8] sql: remove numeric affinity

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Jan 22 18:41:34 MSK 2019


Hi! Thanks for the fixes!

On 16/01/2019 17:26, n.pettik wrote:
> 
>>> Numeric affinity in SQLite means the same as real, except that it
>>> forces integer values into floating point representation in case
>>> it can be converted without loss (e.g. 2.0 -> 2).
>>> Since in Tarantool core there is no difference between numeric and real
>>> values (both are stored as values of type NUMBER), lets remove numeric
>>> affinity and use instead real.
>>> The only real pitfall is implicit conversion mentioned above.
>>> What is more, vinyl engine complicates problem since it relies
>>> on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
>>> For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
>>> be able to use iterators from Lua, since they implicitly change type of
>>> 1.0 and pass it to the iterator as MP_INT.  Solution to this problem is
>>> simple: lets always attempt at encoding floats as ints if conversion
>>> takes place without loss. This is a straightforward approach, but to
>>> implement it we need to care about reversed (decoding) situation.
>>
>> The bug is confirmed: https://github.com/tarantool/tarantool/issues/3907
>>
>> I agree with Kostja - it is just a bug, that Vinyl treats differently
>> integers and their double casts. It should not affect design decisions
>> of this patchset.
> 
> Ok, we may consider part of this patch as a workaround.
> Now affinity removal would significantly help me to fix
> other issues connected with strict typing (e.g. implicit casts).
> afterwards code introduced in this commit may be simplified/removed.
> I don’t know what priority of #3907 issue (milestone is 2.1.1 but
> we know that sometimes it may take a while).

As we agreed, it is just a Vinyl bug, and now this part of the
commit message looks strange:

"
     The only real pitfall is implicit conversion mentioned above.
     What is more, vinyl engine complicates problem since it relies
     on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
     For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
     be able to use iterators from Lua, since they implicitly change type of
     1.0 and pass it to the iterator as MP_INT.
"

Here you've stated that Vinyl even for number indexes sees a difference
between 2.0 and 2, but it is wrong (in an ideal world, but in our it is
just a bug). It is better to write here about not a temporary bug, but
about, for instance, the example I've showed you below.

> 
>> But there is another reason why we can't pass *.0 as an iterator value -
>> our fast comparators (TupleCompare, TupleCompareWithKey) are designed to
>> work with only values of same MP_ type. They do not use slow
>> tuple_compare_field() which is able to compare double and integer.
> 
> Yep, it is sad. I can’t say anything more now,I need to "think about it”.

Please, use this example or find another to support your decision
always 'integerifying' float numbers having zero fraction.

> 
> The only workaround I can come up right now with is to add to
> every OP_Found/OP_Seek etc opcode OP_PromoteType.
> It would attempt at fetching space from cursor and gently
> applying types from format to record to be passed to iterator.
> But this doesn’t seem to be acceptable solution for many reasons.

I think, the implementation in this patch is ok. It is not
necessary to create new opcodes, nor think up another solution.

> 
> Otherwise, I have no idea how to determine required conversions
> *.0 -> * without execution context.

It is impossible of course. You can not learn what a value you
will fetch from cursors before VDBE execution.

> 
>> Moreover, I think, we should forbid implicit *.0 -> * This is why we
>> designed strict typing, isn't it?
> 
> It is too strict rule, ANSI allows this kind of conversion.
> The only restriction is that there shouldn’t be precision loss
> during conversion. It has been already discussed with
> Konstantin and Peter Gulutzan, I inline part of that thread
> (Topic “[dev] Re: Casts")

Understood, ok. Thanks for the investigation.

>>> diff --git a/src/box/sql.c b/src/box/sql.c
>>> index a06c50dca..a498cd8fe 100644
>>> --- a/src/box/sql.c
>>> +++ b/src/box/sql.c
>>> @@ -376,14 +376,18 @@ sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
>>>   	for (uint32_t i = 0; i < field_count; ++i) {
>>>   		struct key_part_def *part = &ephemer_key_parts[i];
>>>   		part->fieldno = i;
>>> -		part->type = FIELD_TYPE_SCALAR;
>>>   		part->nullable_action = ON_CONFLICT_ACTION_NONE;
>>>   		part->is_nullable = true;
>>>   		part->sort_order = SORT_ORDER_ASC;
>>> -		if (def != NULL && i < def->part_count)
>>> +		if (def != NULL && i < def->part_count) {
>>> +			assert(def->parts[i].type < field_type_MAX);
>>> +			part->type = def->parts[i].type != FIELD_TYPE_ANY ?
>>> +				     def->parts[i].type : FIELD_TYPE_SCALAR;
>>>   			part->coll_id = def->parts[i].coll_id;
>>
>> 1. How can key_part have FIELD_TYPE_ANY? We have no comparators for ANY
>> type, it is impossible, isn't it?
> 
> We don’t, and that is why I replace ANY with SCALAR.

No, you still check for "def->parts[i].type != FIELD_TYPE_ANY", and I
can not understand how is it possible. struct key_def can not have
FIELD_TYPE_ANY in its parts.

> 
>>
>>> -		else
>>> +		} else {
>>>   			part->coll_id = COLL_NONE;
>>> +			part->type = FIELD_TYPE_SCALAR;
>>> +		}
>>>   	}
>>>   	struct key_def *ephemer_key_def = key_def_new(ephemer_key_parts,
>>>   						      field_count);




More information about the Tarantool-patches mailing list