[tarantool-patches] Re: [PATCH 3/8] sql: remove numeric affinity
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Tue Jan 22 18:41:34 MSK 2019
Hi! Thanks for the fixes!
On 16/01/2019 17:26, n.pettik wrote:
>
>>> Numeric affinity in SQLite means the same as real, except that it
>>> forces integer values into floating point representation in case
>>> it can be converted without loss (e.g. 2.0 -> 2).
>>> Since in Tarantool core there is no difference between numeric and real
>>> values (both are stored as values of type NUMBER), lets remove numeric
>>> affinity and use instead real.
>>> The only real pitfall is implicit conversion mentioned above.
>>> What is more, vinyl engine complicates problem since it relies
>>> on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
>>> For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
>>> be able to use iterators from Lua, since they implicitly change type of
>>> 1.0 and pass it to the iterator as MP_INT. Solution to this problem is
>>> simple: lets always attempt at encoding floats as ints if conversion
>>> takes place without loss. This is a straightforward approach, but to
>>> implement it we need to care about reversed (decoding) situation.
>>
>> The bug is confirmed: https://github.com/tarantool/tarantool/issues/3907
>>
>> I agree with Kostja - it is just a bug, that Vinyl treats differently
>> integers and their double casts. It should not affect design decisions
>> of this patchset.
>
> Ok, we may consider part of this patch as a workaround.
> Now affinity removal would significantly help me to fix
> other issues connected with strict typing (e.g. implicit casts).
> afterwards code introduced in this commit may be simplified/removed.
> I don’t know what priority of #3907 issue (milestone is 2.1.1 but
> we know that sometimes it may take a while).
As we agreed, it is just a Vinyl bug, and now this part of the
commit message looks strange:
"
The only real pitfall is implicit conversion mentioned above.
What is more, vinyl engine complicates problem since it relies
on data encoding (i.e. whether it is encoded as MP_INT or MP_FLOAT).
For instance, if we encode 1.0 as MP_FLOAT during insertion, we won't
be able to use iterators from Lua, since they implicitly change type of
1.0 and pass it to the iterator as MP_INT.
"
Here you've stated that Vinyl even for number indexes sees a difference
between 2.0 and 2, but it is wrong (in an ideal world, but in our it is
just a bug). It is better to write here about not a temporary bug, but
about, for instance, the example I've showed you below.
>
>> But there is another reason why we can't pass *.0 as an iterator value -
>> our fast comparators (TupleCompare, TupleCompareWithKey) are designed to
>> work with only values of same MP_ type. They do not use slow
>> tuple_compare_field() which is able to compare double and integer.
>
> Yep, it is sad. I can’t say anything more now,I need to "think about it”.
Please, use this example or find another to support your decision
always 'integerifying' float numbers having zero fraction.
>
> The only workaround I can come up right now with is to add to
> every OP_Found/OP_Seek etc opcode OP_PromoteType.
> It would attempt at fetching space from cursor and gently
> applying types from format to record to be passed to iterator.
> But this doesn’t seem to be acceptable solution for many reasons.
I think, the implementation in this patch is ok. It is not
necessary to create new opcodes, nor think up another solution.
>
> Otherwise, I have no idea how to determine required conversions
> *.0 -> * without execution context.
It is impossible of course. You can not learn what a value you
will fetch from cursors before VDBE execution.
>
>> Moreover, I think, we should forbid implicit *.0 -> * This is why we
>> designed strict typing, isn't it?
>
> It is too strict rule, ANSI allows this kind of conversion.
> The only restriction is that there shouldn’t be precision loss
> during conversion. It has been already discussed with
> Konstantin and Peter Gulutzan, I inline part of that thread
> (Topic “[dev] Re: Casts")
Understood, ok. Thanks for the investigation.
>>> diff --git a/src/box/sql.c b/src/box/sql.c
>>> index a06c50dca..a498cd8fe 100644
>>> --- a/src/box/sql.c
>>> +++ b/src/box/sql.c
>>> @@ -376,14 +376,18 @@ sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
>>> for (uint32_t i = 0; i < field_count; ++i) {
>>> struct key_part_def *part = &ephemer_key_parts[i];
>>> part->fieldno = i;
>>> - part->type = FIELD_TYPE_SCALAR;
>>> part->nullable_action = ON_CONFLICT_ACTION_NONE;
>>> part->is_nullable = true;
>>> part->sort_order = SORT_ORDER_ASC;
>>> - if (def != NULL && i < def->part_count)
>>> + if (def != NULL && i < def->part_count) {
>>> + assert(def->parts[i].type < field_type_MAX);
>>> + part->type = def->parts[i].type != FIELD_TYPE_ANY ?
>>> + def->parts[i].type : FIELD_TYPE_SCALAR;
>>> part->coll_id = def->parts[i].coll_id;
>>
>> 1. How can key_part have FIELD_TYPE_ANY? We have no comparators for ANY
>> type, it is impossible, isn't it?
>
> We don’t, and that is why I replace ANY with SCALAR.
No, you still check for "def->parts[i].type != FIELD_TYPE_ANY", and I
can not understand how is it possible. struct key_def can not have
FIELD_TYPE_ANY in its parts.
>
>>
>>> - else
>>> + } else {
>>> part->coll_id = COLL_NONE;
>>> + part->type = FIELD_TYPE_SCALAR;
>>> + }
>>> }
>>> struct key_def *ephemer_key_def = key_def_new(ephemer_key_parts,
>>> field_count);
More information about the Tarantool-patches
mailing list