Tarantool discussions archive
 help / color / mirror / Atom feed
* [Tarantool-discussions] Implicit cast in SQL
@ 2020-01-11 13:33 Mergen Imeev
  2020-01-11 21:49 ` Peter Gulutzan
  0 siblings, 1 reply; 3+ messages in thread
From: Mergen Imeev @ 2020-01-11 13:33 UTC (permalink / raw)
  To: tarantool-discussions

Hi all,
I have a question to discuss. I know that we have discussed this
issue quite a few times. But, since we have a few open issues on
github, I think the final answer has not yet been found. I want to
completely close the issue this time.

I would like to start with an implicit cast. Implicit cast
consists of implicit type conversion for COMPARISON and implicit
type conversion for ASSIGNMENT.


Currently, all types can be divided into four categories:
numeric (UNSIGNED, INTEGER, NUMBER, DOUBLE), string (STRING),
binary (VARBINARY) and boolean (BOOLEAN). We also have a scalar
type, but any value of a scalar type has some subtype that can be
placed in one of these four categories.

So, I suggest this:
Implicit type conversion for COMPARISON:
1) types of the same category can be implicitly cast to each
other;
2) types of different categories cannot be implicitly cast to
each other.

Implicit type conversion for ASSIGNMENT:
1) If a user-defined cast function (UDCF) is defined, then this
function is used;
2) If UDCF is not defined, then types of the same category can be
implicitly cast to each other, and types of different categories
cannot be implicitly cast to each other.

I think this is what ANSI says.

We currently do not have such a thing as UDCF. I suggest to fill
an issue and use the same rules for ASSIGNMENT as for COMPARISON.
In fact, I think we can create a special option for assigning
UCDF during the rework of implicit type conversion for ASSIGNMENT.

What do you think about this?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Tarantool-discussions] Implicit cast in SQL
  2020-01-11 13:33 [Tarantool-discussions] Implicit cast in SQL Mergen Imeev
@ 2020-01-11 21:49 ` Peter Gulutzan
  2020-01-12 12:40   ` Mergen Imeev
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Gulutzan @ 2020-01-11 21:49 UTC (permalink / raw)
  To: Mergen Imeev, tarantool-discussions

Hi,

On 2020-01-11 6:33 a.m., Mergen Imeev wrote:
 > Hi all,
 > I have a question to discuss. I know that we have discussed this
 > issue quite a few times.

Very true! I thought that most of these matters were settled.
Some of the discussions were with K. Osipov or CCed to dev,
for example the long Re: [dev] cast thread.
Should they be on the email "To:" list?

 > But, since we have a few open issues on
 > github, I think the final answer has not yet been found.

I see two open issues on github say implicit cast:
#3809 sql: implicit type cast problems during assignment 
https://github.com/tarantool/tarantool/issues/3809
#4230 sql: implicit type cast during comparison 
https://github.com/tarantool/tarantool/issues/4230
Are there others that I have missed?

 > I want to
 > completely close the issue this time.
 >
 > I would like to start with an implicit cast. Implicit cast
 > consists of implicit type conversion for COMPARISON and implicit
 > type conversion for ASSIGNMENT.

I will compare your suggestion with the chart that I wrote for
the documentation, which was the result of my understanding from
prior discussions. The documentation has not been published,
but can be seen in section Data type conversion, lines 1632 and 
following, in
https://github.com/tarantool/doc/blob/pgulutzan-2.3/doc/reference/reference_sql/sql.rst

 > Currently, all types can be divided into four categories:
 > numeric (UNSIGNED, INTEGER, NUMBER, DOUBLE), string (STRING),
 > binary (VARBINARY) and boolean (BOOLEAN). We also have a scalar
 > type, but any value of a scalar type has some subtype that can be
 > placed in one of these four categories.
 >

Please don't call them "subtypes". I proposed that we should regard
SCALAR as a supertype (and therefore the rules for the supertype
would apply for the subtype), but that was rejected.

 > So, I suggest this:
 > Implicit type conversion for COMPARISON:
 > 1) types of the same category can be implicitly cast to each
 > other;

Or "implicitly cast to the type whose range includes both values".

 > 2) types of different categories cannot be implicitly cast to
 > each other.
 >

The chart does not agree.

* The chart says A (always) numeric can be compared to string.
But I was fooled, I did not look in both directions.
Now S (sometimes) a numeric can be compared to a string.
For example, 1 = 'a' is legal, 'a' = 1 is illegal.
But anyway, it is currently not always true that types of different
categories cannot be implicitly cast to each other.

Incidentally, I believe this behaviour is bad, I argued against
implicit casting of numbers and strings for either comparison
or assignment. But other people, probably the majority,
argued that it was good, so it is the current behaviour.

Also, in SCALAR columns, comparisons of values of different
data types are allowed for the sake of ORDER BY (booleans
are less than strings etc.) but not allowed for the sake of
WHERE (searches for booleans cause an error if there are
strings in the column etc.). This is, I believe, a bad thing.

 > Implicit type conversion for ASSIGNMENT:
 > 1) If a user-defined cast function (UDCF) is defined, then this
 > function is used;
 > 2) If UDCF is not defined, then types of the same category can be
 > implicitly cast to each other, and types of different categories
 > cannot be implicitly cast to each other.
 >

The chart does not agree.

* The chart says A (always) numeric can be assigned to string,
and S (sometimes) string can be assigned to number.

* The chart says numeric to numeric implicit conversion is S
(sometimes allowed) not A (always allowed) because of range
differences, for example -1 will not fit in UNSIGNED.
Of course, you can reply that the cast is allowed but the
assignment is not allowed, if you prefer.

I have assumed that function calls may involve implicit assignment.

 > I think this is what ANSI says.
 >
 > We currently do not have such a thing as UDCF. I suggest to fill
 > an issue and use the same rules for ASSIGNMENT as for COMPARISON.
 > In fact, I think we can create a special option for assigning
 > UCDF during the rework of implicit type conversion for ASSIGNMENT.
 >
 > What do you think about this?
 >

When you say user-defined casts, do you really mean that
Tarantool's users will be given facilities that allow them
to write the code that handles comparison and assignment?
I think this might be hard, but will wait till I see the issue.

Peter Gulutzan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Tarantool-discussions] Implicit cast in SQL
  2020-01-11 21:49 ` Peter Gulutzan
@ 2020-01-12 12:40   ` Mergen Imeev
  0 siblings, 0 replies; 3+ messages in thread
From: Mergen Imeev @ 2020-01-12 12:40 UTC (permalink / raw)
  To: Peter Gulutzan; +Cc: tarantool-discussions

Hi,
thank you for answering.

On Sat, Jan 11, 2020 at 02:49:41PM -0700, Peter Gulutzan wrote:
> Hi,
> 
> On 2020-01-11 6:33 a.m., Mergen Imeev wrote:
> > Hi all,
> > I have a question to discuss. I know that we have discussed this
> > issue quite a few times.
> 
> Very true! I thought that most of these matters were settled.
> Some of the discussions were with K. Osipov or CCed to dev,
> for example the long Re: [dev] cast thread.
> Should they be on the email "To:" list?
> 
I remember this, I also remember that at some point it was
decided that "we don’t have time now, so let's postpone the
discussion to a later date." I think it was somewhere in
the thread "Check types of arguments of functions." So, I
think the time has come.

> > But, since we have a few open issues on
> > github, I think the final answer has not yet been found.
> 
> I see two open issues on github say implicit cast:
> #3809 sql: implicit type cast problems during assignment
> https://github.com/tarantool/tarantool/issues/3809
> #4230 sql: implicit type cast during comparison
> https://github.com/tarantool/tarantool/issues/4230
> Are there others that I have missed?
> 
I think that there is at least two more:
"#4159: sql: length takes not-string values"
https://github.com/tarantool/tarantool/issues/4159
"#4470: SQL arithmetic types are broken"
https://github.com/tarantool/tarantool/issues/4470

> > I want to
> > completely close the issue this time.
> >
> > I would like to start with an implicit cast. Implicit cast
> > consists of implicit type conversion for COMPARISON and implicit
> > type conversion for ASSIGNMENT.
> 
> I will compare your suggestion with the chart that I wrote for
> the documentation, which was the result of my understanding from
> prior discussions. The documentation has not been published,
> but can be seen in section Data type conversion, lines 1632 and following,
> in
> https://github.com/tarantool/doc/blob/pgulutzan-2.3/doc/reference/reference_sql/sql.rst
> 
> > Currently, all types can be divided into four categories:
> > numeric (UNSIGNED, INTEGER, NUMBER, DOUBLE), string (STRING),
> > binary (VARBINARY) and boolean (BOOLEAN). We also have a scalar
> > type, but any value of a scalar type has some subtype that can be
> > placed in one of these four categories.
> >
> 
> Please don't call them "subtypes". I proposed that we should regard
> SCALAR as a supertype (and therefore the rules for the supertype
> would apply for the subtype), but that was rejected.
> 
Ok, I won't call them 'subtypes' anymore.

> > So, I suggest this:
> > Implicit type conversion for COMPARISON:
> > 1) types of the same category can be implicitly cast to each
> > other;
> 
> Or "implicitly cast to the type whose range includes both values".
> 
> > 2) types of different categories cannot be implicitly cast to
> > each other.
> >
> 
> The chart does not agree.
> 
> * The chart says A (always) numeric can be compared to string.
> But I was fooled, I did not look in both directions.
> Now S (sometimes) a numeric can be compared to a string.
> For example, 1 = 'a' is legal, 'a' = 1 is illegal.
> But anyway, it is currently not always true that types of different
> categories cannot be implicitly cast to each other.
> 
> Incidentally, I believe this behaviour is bad, I argued against
> implicit casting of numbers and strings for either comparison
> or assignment. But other people, probably the majority,
> argued that it was good, so it is the current behaviour.
> 
> Also, in SCALAR columns, comparisons of values of different
> data types are allowed for the sake of ORDER BY (booleans
> are less than strings etc.) but not allowed for the sake of
> WHERE (searches for booleans cause an error if there are
> strings in the column etc.). This is, I believe, a bad thing.
> 
I suggest starting with a comparison. It is possible that
I previously thought that implicit casting is something
good. But not now. For some reason, I began to believe
that implicit casting is evil.

First, I would like to make some definitions. Let's say that type
A can be naturally cast to type B if and only if any value of type
A does not change the internal representation when casting to type
B.

Currently, in SQL, natural casts are used in the following three
situations:
1) UNSIGNED can be naturally cast to INTEGER;
2) any numeric type can be naturally cast to a NUMBER;
3) any scalar type can be naturally cast to a SCALAR.

So, keeping this definition in mind, I propose these rules for
comparison:
1) If both values are of numeric types, they can be compared.
2) If one value has a type that can be naturally cast to the type
of another value, then they can be compared.
3) If none of the previous rules can be applied to values, then
they cannot be compared.

So, for the type set currently available in SQL, I mean the
following:
1) You can compare two values of numeric types. No implicit cast
is required. Special function used.
2) A value of type BOOLEAN can only be compared with a value of
type BOOLEAN. No implicit cast is required.
3) A value of type STRING can only be compared with a value of
type STRING. No implicit cast is required.
4) A value of type VARBINARY can only be compared with a value of
type VARBINARY. No implicit cast is required.
5) The value of any scalar type can be compared with the value of
the SCALAR type. Here the first value is naturally cast to SCALAR.
6) In the other situations values cannot be compared with each
other.

If I'm not mistaken, something similar is used in the non-SQL part
of Tarantool.

What do you think about this?

> > Implicit type conversion for ASSIGNMENT:
> > 1) If a user-defined cast function (UDCF) is defined, then this
> > function is used;
> > 2) If UDCF is not defined, then types of the same category can be
> > implicitly cast to each other, and types of different categories
> > cannot be implicitly cast to each other.
> >
> 
> The chart does not agree.
> 
> * The chart says A (always) numeric can be assigned to string,
> and S (sometimes) string can be assigned to number.
> 
> * The chart says numeric to numeric implicit conversion is S
> (sometimes allowed) not A (always allowed) because of range
> differences, for example -1 will not fit in UNSIGNED.
> Of course, you can reply that the cast is allowed but the
> assignment is not allowed, if you prefer.
> 
> I have assumed that function calls may involve implicit assignment.
> 
> > I think this is what ANSI says.
> >
> > We currently do not have such a thing as UDCF. I suggest to fill
> > an issue and use the same rules for ASSIGNMENT as for COMPARISON.
> > In fact, I think we can create a special option for assigning
> > UCDF during the rework of implicit type conversion for ASSIGNMENT.
> >
> > What do you think about this?
> >
> 
> When you say user-defined casts, do you really mean that
> Tarantool's users will be given facilities that allow them
> to write the code that handles comparison and assignment?
I meant this, but only for assignment. Do we need any
implicit casts other than natural in comparison?

> I think this might be hard, but will wait till I see the issue.
> 
> Peter Gulutzan
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-01-12 12:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-11 13:33 [Tarantool-discussions] Implicit cast in SQL Mergen Imeev
2020-01-11 21:49 ` Peter Gulutzan
2020-01-12 12:40   ` Mergen Imeev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox