[Tarantool-discussions] Implicit cast for COMPARISON

Konstantin Osipov kostja.osipov at gmail.com
Mon Feb 3 18:02:22 MSK 2020


* Mergen Imeev <imeevma at tarantool.org> [20/02/03 14:39]:
> > > On 2020-01-21 11:09 a.m., Peter Gulutzan wrote:
> > > <cut>
> > > > I think that the proposed change is good.
> > > I withdraw that remark. I misunderstood the proposal.
> > 
> > I side with PeterG.
> > 
> > -- 
> > Konstantin Osipov, Moscow, Russia
> 
> Hi,
> 
> Let me clarify: you think that during comparison it makes sense
> that STRING is implicitly converted to numbers?


The issue is not with any specific problem, which may require a
solution, the issue is with how the solution is packaged in your
proposal.


Shall strings be implicitly converted to numbers or numbers
implicitly converted to strings during comparison? Probably not.
Especially, if, as you pointed out, there are other rules for
SCALAR or ANY bags of types, for which implicit conversion will
not taking place.


Shall we use the rules you suggested? I am afraid not, first of
all because it's unclear whether this is the end of the story.

While PeterG, I believe, would advocate strict adherence to ANSI,
my take is that we should not give existing postgresql or oracle
users too much trouble here. The reason I don't include MySQL on
the list is that I know that mysql has historically been way more
messy with its implicit conversion rules, so I'd not copy it here. 

Can we begin fixing this one thing at a time? This would help
everyone quicker find balance between these priorities.
E.g. can we make
sure that all numeric types are comparable (as it seems already
everyone's desire), comparison rules are
clear to everyone and everyone agrees with them, fix all bugs and
cover it with tests?

Then we can do the same with string/blob types. 

Then we can look at comparison and implicit conversion rules
between type families, and also carefully study existing vendors,
so we don't break too much existing PostgreSQL or MySQL queries
with more strict comparison rules or funny comparison order.

> If so, then let's think about it. I think we have some questions
> to discuss in this case:
> 1) I think it makes sense that numeric types can be compared
> without any conversion. Do you agree? We have a special function
> that implements a comparison between integers and floating point
> numbers. If you do not agree with me, then make your suggestion.

I agree it makes sense, but to be on the safe side, I'd also check 
how existing vendors handle it - so that we don't introduce some
obvious inconvenience.

> 2) STRING can be implicitly cast to a number. In case it can be
> cast to INTEGER, it will be INTEGER. In case it can be cast to
> DOUBLE, it will be DOUBLE. Do you agree? Should we return to this
> issue after the implementation of DECIMAL?

I don't know. I'd say DECIMAL is a game changer. I'd say nothing
should be implicitly cast to floating point type once there is an
exact type.
Besides, there are collate/locale issues. Are we going to handle
"1 234 567"? What about "1,234,567"?

> 3) Can STRING be implicitly cast to BOOLEAN?

Look at vendors and ANSI, deal with one problem at a time. It's
not a major issue. The biggest issue is string -> number, and 
it's best to deal with it after decimal is part of SQL.

> 4) Can STRING be implicitly cast to BINARY?

Ditto, look at vendors and ANSI, move into an own patch..

The same applies  to the rest of your questions: there is no user
value in us trying to provide what we believe is a "better"
semantics over the existing vendors and ansi in such small
details. Even if we think it's safe. 


> 5) In case STRING cannot be implicitly cast to the type of the
> other operand, should we allow implicitly casting the other
> operand to STRING? For example, from "'123r' > 124" move to
> "'123r' > '124'"?


> 6) Do you agree that only implicit casting from/to STRING is
> allowed? I mean that nothing else can be implicitly cast during a
> comparison with any other type if one of the types does not
> contain the other.
> 7) We need to clarify the rules when comparing SCALAR values. I
> think we cannot use the Tarantool rules here, as the Tarantool
> rules indicate that “100 < '2' == true”, but we decided that
> "100 > '2' == true", since '2' implicitly cast to 2. Could you
> suggest the rules that we should use here?
> 

-- 
Konstantin Osipov, Moscow, Russia


More information about the Tarantool-discussions mailing list