* [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description. @ 2020-06-22 11:52 Mergen Imeev 2020-06-22 18:43 ` Peter Gulutzan 0 siblings, 1 reply; 6+ messages in thread From: Mergen Imeev @ 2020-06-22 11:52 UTC (permalink / raw) To: pgulutzan, tarantool-discussions, korablev, Vladislav Shpilevoy, tsafin Hello Peter! I have two questions. The first. I am sorry to ask about this again, but I want to discuss once again the implicit cast for assignment between numeric types. I wrote earlier that I plan to use C rules for implicit casting. However, after implementation, this looks terribly wrong for me. Temporarily, I decided to allow implicit casting only if the number after conversion from one type to another using C rules and back remains unchanged. For example, a double 50.0 can be implicitly cast to an integer 50, but a double 50.5 cannot be cast to an integer, and an integer 2^60-1 cannot be cast to a double. However, I am not sure if this is correct. Could you suggest rules for implicit casting to assign numeric types? The second question is about type mismatch error text. We currently have two types of type mismatch errors. The first: - 'Type mismatch: can not convert some_text to integer' The second: - 'Type mismatch: can not convert text to integer' In the first case, we use the value in the error description, in the second we use the type. Now we have decided on the same opinion. We have 3 options: 1) use value in error description: Type mismatch: can not convert 'some_text' to integer 2) use type in error description: Type mismatch: can not convert text to integer 3) use both in error description: Type mismatch: can not convert 'some_text'(text) to integer The first and third options look a little unsafe, since the values will be shown in the logs. However, they provide more information than the second option. Which one do you think is the best? Or maybe you can offer another option? In addition, we must remember that at some point the description becomes the same as in Lua (issue #5074): 'Tuple field 1 type does not match one required by operation: expected integer' ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description. 2020-06-22 11:52 [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description Mergen Imeev @ 2020-06-22 18:43 ` Peter Gulutzan 2020-06-23 16:15 ` Mergen Imeev 0 siblings, 1 reply; 6+ messages in thread From: Peter Gulutzan @ 2020-06-22 18:43 UTC (permalink / raw) To: Mergen Imeev, tarantool-discussions, korablev, Vladislav Shpilevoy, tsafin Hi, On 2020-06-22 5:52 a.m., Mergen Imeev wrote: > Hello Peter! > > I have two questions. > > The first. I am sorry to ask about this again, but I want to > discuss once again the implicit cast for assignment between > numeric types. I wrote earlier that I plan to use C rules for > implicit casting. However, after implementation, this looks > terribly wrong for me. Temporarily, I decided to allow implicit > casting only if the number after conversion from one type to > another using C rules and back remains unchanged. For example, a > double 50.0 can be implicitly cast to an integer 50, but a > double 50.5 cannot be cast to an integer, and an integer 2^60-1 > cannot be cast to a double. > > However, I am not sure if this is correct. Could you suggest rules > for implicit casting to assign numeric types? > Re "double 50.5" On 2020-04-30 in thread (CCed to several people and to dev) "Re: [Tarantool-discussions] Implicit cast for ASSIGNMENT" I read: Mergen Imeev> I think that the rules for implicit casting when assigning value Mergen Imeev> of numeric type must be the same as in C. Peter Gulutzan> You mean there should be truncation not rounding? Mergen Imeev> Yes. Therefore I expected box.execute([[CREATE TABLE t0 (s1 INTEGER PRIMARY KEY);]]) box.execute([[INSERT INTO t0 VALUES (?);]], {50.5}) would be okay, result = 50. Re "integer 2^60-1": Currently I can say: box.execute([[CREATE TABLE t (s1 DOUBLE PRIMARY KEY);]]) box.execute([[INSERT INTO t VALUES (?);]], 2^60 - 1) and is okay today, result = 2^60 - 1 (I think). I suggest the rules that were discussed earlier: Loss of significant digits (overflow) is a runtime error. Loss of insignificant digits is not an error. I think this should be possible because explicit casts work: box.execute([[SELECT CAST(? AS INTEGER);]],{2^64}) -- result = error box.execute([[SELECT CAST(50.5 AS INTEGER);]]) -- result = 50 > The second question is about type mismatch error text. > > We currently have two types of type mismatch errors. > > The first: > - 'Type mismatch: can not convert some_text to integer' > > The second: > - 'Type mismatch: can not convert text to integer' > > In the first case, we use the value in the error description, in > the second we use the type. > > Now we have decided on the same opinion. > > We have 3 options: > 1) use value in error description: > Type mismatch: can not convert 'some_text' to integer > 2) use type in error description: > Type mismatch: can not convert text to integer > 3) use both in error description: > Type mismatch: can not convert 'some_text'(text) to integer > > The first and third options look a little unsafe, since the values > will be shown in the logs. However, they provide more information > than the second option. > > Which one do you think is the best? Or maybe you can offer another > option? > > In addition, we must remember that at some point the description > becomes the same as in Lua (issue #5074): > 'Tuple field 1 type does not match one required by operation: > expected integer' > Re 'Type mismatch: can not convert text to integer': You use "type", but that is not as specific as it could be. Why not "data type"? You use "can not", but in other error messages I see "cannot", for example "cannot create trigger on system table" (trigger.c) Why be inconsistent? You use "text", but that is a synonym, not what typeof() returns. Why not "string"? You use "convert", but look at the title of this email thread. Why not "implicitly cast"? Actually "conversion" is better now, it is in the manual. But I could change the manual if people wanted "cast". Re 'Type mismatch: can not convert some_text to integer' I do not know what your plan is, when some_text is extremely long, or when the value has non-displayable bytes / characters. I do not know what your plan is, when the source is not is not a literal value, but instead is ? or is a column name or is a function result. I guess the answer is still "the value", but I might be confused if the error message mentions the value but the value does not appear in the SQL statement. You are concentrating on making it clear what the source is. Why not make it clear what the target is? I think that the other way to be clear about the problem is: return the character number of the token that is bad. But that is not a request about the message content. Re 'Type mismatch: can not convert 'some_text'(text) to integer' To a beginner this is helpful, to an experienced user it is redundant. I will guess that most users, after only a few days, will have enough experience that they know a literal inside single quotes is a string. So the question is: is this so obvious that it is irritating? I guess not. So I like this option. Of course I also like some texts in the SQL standard document, "data exception -- numeric value out of range" and "data exception -- invalid character value for cast", but I don't like them so much that I want to fight about them. Peter Gulutzan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description. 2020-06-22 18:43 ` Peter Gulutzan @ 2020-06-23 16:15 ` Mergen Imeev 2020-06-23 19:42 ` Peter Gulutzan 0 siblings, 1 reply; 6+ messages in thread From: Mergen Imeev @ 2020-06-23 16:15 UTC (permalink / raw) To: Peter Gulutzan, tarantool-discussions, korablev, Vladislav Shpilevoy, tsafin Hi! On 22.06.2020 21:43, Peter Gulutzan wrote: > Hi, > > On 2020-06-22 5:52 a.m., Mergen Imeev wrote: > > Hello Peter! > > > > I have two questions. > > > > The first. I am sorry to ask about this again, but I want to > > discuss once again the implicit cast for assignment between > > numeric types. I wrote earlier that I plan to use C rules for > > implicit casting. However, after implementation, this looks > > terribly wrong for me. Temporarily, I decided to allow implicit > > casting only if the number after conversion from one type to > > another using C rules and back remains unchanged. For example, a > > double 50.0 can be implicitly cast to an integer 50, but a > > double 50.5 cannot be cast to an integer, and an integer 2^60-1 > > cannot be cast to a double. > > > > However, I am not sure if this is correct. Could you suggest rules > > for implicit casting to assign numeric types? > > > > Re "double 50.5" > On 2020-04-30 in thread (CCed to several people and to dev) > "Re: [Tarantool-discussions] Implicit cast for ASSIGNMENT" > I read: > Mergen Imeev> I think that the rules for implicit casting when > assigning value > Mergen Imeev> of numeric type must be the same as in C. > Peter Gulutzan> You mean there should be truncation not rounding? > Mergen Imeev> Yes. > Therefore I expected > box.execute([[CREATE TABLE t0 (s1 INTEGER PRIMARY KEY);]]) > box.execute([[INSERT INTO t0 VALUES (?);]], {50.5}) > would be okay, result = 50. > > Re "integer 2^60-1": > Currently I can say: > box.execute([[CREATE TABLE t (s1 DOUBLE PRIMARY KEY);]]) > box.execute([[INSERT INTO t VALUES (?);]], 2^60 - 1) > and is okay today, result = 2^60 - 1 (I think). > > I suggest the rules that were discussed earlier: > Loss of significant digits (overflow) is a runtime error. > Loss of insignificant digits is not an error. > I think this should be possible because explicit casts work: > box.execute([[SELECT CAST(? AS INTEGER);]],{2^64}) -- result = error > box.execute([[SELECT CAST(50.5 AS INTEGER);]]) -- result = 50 > Thanks for the answer. However, I now have another question: should we apply these rules for cases like this: SELECT * FROM t LIMIT 2.5; I mean, we should throw an error here or execute it like this: SELECT * FROM t LIMIT 2; > > The second question is about type mismatch error text. > > > > We currently have two types of type mismatch errors. > > > > The first: > > - 'Type mismatch: can not convert some_text to integer' > > > > The second: > > - 'Type mismatch: can not convert text to integer' > > > > In the first case, we use the value in the error description, in > > the second we use the type. > > > > Now we have decided on the same opinion. > > > > We have 3 options: > > 1) use value in error description: > > Type mismatch: can not convert 'some_text' to integer > > 2) use type in error description: > > Type mismatch: can not convert text to integer > > 3) use both in error description: > > Type mismatch: can not convert 'some_text'(text) to integer > > > > The first and third options look a little unsafe, since the values > > will be shown in the logs. However, they provide more information > > than the second option. > > > > Which one do you think is the best? Or maybe you can offer another > > option? > > > > In addition, we must remember that at some point the description > > becomes the same as in Lua (issue #5074): > > 'Tuple field 1 type does not match one required by operation: > > expected integer' > > > > Re 'Type mismatch: can not convert text to integer': > > You use "type", but that is not as specific as it could be. > Why not "data type"? > > You use "can not", but in other error messages > I see "cannot", for example > "cannot create trigger on system table" (trigger.c) > Why be inconsistent? > > You use "text", but that is a synonym, not what typeof() returns. > Why not "string"? > > You use "convert", but look at the title of this email thread. > Why not "implicitly cast"? > Actually "conversion" is better now, it is in the manual. > But I could change the manual if people wanted "cast". > > Re 'Type mismatch: can not convert some_text to integer' > > I do not know what your plan is, when some_text is extremely > long, or when the value has non-displayable bytes / characters. > > I do not know what your plan is, when the source is not > is not a literal value, but instead is ? or is a > column name or is a function result. I guess the answer > is still "the value", but I might be confused if the > error message mentions the value but the value does not > appear in the SQL statement. > > You are concentrating on making it clear what the source is. > Why not make it clear what the target is? > > I think that the other way to be clear about the problem is: > return the character number of the token that is bad. > But that is not a request about the message content. > > Re 'Type mismatch: can not convert 'some_text'(text) to integer' > > To a beginner this is helpful, to an experienced user it > is redundant. I will guess that most users, after only a > few days, will have enough experience that they know a > literal inside single quotes is a string. So the question is: > is this so obvious that it is irritating? > I guess not. > So I like this option. > > Of course I also like some texts in the SQL standard document, > "data exception -- numeric value out of range" and > "data exception -- invalid character value for cast", > but I don't like them so much that I want to fight about them. > > Peter Gulutzan > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description. 2020-06-23 16:15 ` Mergen Imeev @ 2020-06-23 19:42 ` Peter Gulutzan 2020-06-25 8:02 ` Mergen Imeev 0 siblings, 1 reply; 6+ messages in thread From: Peter Gulutzan @ 2020-06-23 19:42 UTC (permalink / raw) To: Mergen Imeev, tarantool-discussions, korablev, Vladislav Shpilevoy, tsafin Hi, On 2020-06-23 10:15 a.m., Mergen Imeev wrote: <cut> > Thanks for the answer. However, I now have another question: > should we apply these rules for cases like this: > SELECT * FROM t LIMIT 2.5; > > I mean, we should throw an error here or execute it like this: > SELECT * FROM t LIMIT 2; <cut> In the SQL-standard equivalent of LIMIT n, which is FETCH FIRST n ROWS, that is a syntax error because "The declared type of <fetch first row count> shall be an exact numeric with scale 0 (zero)." For us it is a runtime error ... " tarantool> n = 0.000000001 --- ... tarantool> box.execute([[SELECT 5 LIMIT ?;]], {n}) --- - null - 'Failed to execute SQL statement: Only positive integers are allowed in the LIMIT clause' ... " and you are proposing that there should be no error. On the other hand, box.execute([[CREATE TABLE t (s1 VARCHAR(2.0) PRIMARY KEY);]]) causes a "Syntax error", and you are not proposing that we change that. Another case is box.execute([[SELECT SUBSTR('abcde',2.99);]]) which is legal, 2.99 is truncated to 2. Therefore, I think your proposal means: If n is not an integer, and that is not detected as a syntax error, then there should be no error or warning. And I think the precedent of substr() means: there should be truncation not rounding. If I have understood correctly, then I agree. Peter Gulutzan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description. 2020-06-23 19:42 ` Peter Gulutzan @ 2020-06-25 8:02 ` Mergen Imeev 2020-06-25 19:39 ` Peter Gulutzan 0 siblings, 1 reply; 6+ messages in thread From: Mergen Imeev @ 2020-06-25 8:02 UTC (permalink / raw) To: Peter Gulutzan, tarantool-discussions, korablev, Vladislav Shpilevoy, tsafin Hi, On 23.06.2020 22:42, Peter Gulutzan wrote: > Hi, > > On 2020-06-23 10:15 a.m., Mergen Imeev wrote: > <cut> > > Thanks for the answer. However, I now have another question: > > should we apply these rules for cases like this: > > SELECT * FROM t LIMIT 2.5; > > > > I mean, we should throw an error here or execute it like this: > > SELECT * FROM t LIMIT 2; > <cut> > > In the SQL-standard equivalent of LIMIT n, > which is FETCH FIRST n ROWS, that is a syntax > error because > "The declared type of <fetch first row count> > shall be an exact numeric with scale 0 (zero)." > For us it is a runtime error ... Should we generate a syntax error here instead of a runtime error? However, in this case, we will not be able to execute something like this: SELECT * FROM t LIMIT 1 + 2; I will fill an issue if you think so. Actually, generating an error here seems less painful than making an integer from a double. > " > tarantool> n = 0.000000001 > --- > ... > tarantool> box.execute([[SELECT 5 LIMIT ?;]], {n}) > --- > - null > - 'Failed to execute SQL statement: Only positive > integers are allowed in the LIMIT > clause' > ... > " > and you are proposing that there should be no error. No, I think the error should be here. However, I'm not sure, so I asked you. > On the other hand, > box.execute([[CREATE TABLE t (s1 VARCHAR(2.0) PRIMARY KEY);]]) > causes a "Syntax error", and you are not proposing > that we change that. > Another case is > box.execute([[SELECT SUBSTR('abcde',2.99);]]) > which is legal, 2.99 is truncated to 2. I think this is worth fixing. Or is it better the way it is now? > > Therefore, I think your proposal means: > If n is not an integer, > and that is not detected as a syntax error, > then there should be no error or warning. > And I think the precedent of substr() means: > there should be truncation not rounding. > > If I have understood correctly, then I agree. > > Peter Gulutzan > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description. 2020-06-25 8:02 ` Mergen Imeev @ 2020-06-25 19:39 ` Peter Gulutzan 0 siblings, 0 replies; 6+ messages in thread From: Peter Gulutzan @ 2020-06-25 19:39 UTC (permalink / raw) To: Mergen Imeev, tarantool-discussions, korablev, Vladislav Shpilevoy, tsafin Hi, On 2020-06-25 2:02 a.m., Mergen Imeev wrote: > Hi, > > On 23.06.2020 22:42, Peter Gulutzan wrote: >> Hi, >> >> On 2020-06-23 10:15 a.m., Mergen Imeev wrote: >> <cut> >> > Thanks for the answer. However, I now have another question: >> > should we apply these rules for cases like this: >> > SELECT * FROM t LIMIT 2.5; >> > >> > I mean, we should throw an error here or execute it like this: >> > SELECT * FROM t LIMIT 2; >> <cut> >> >> In the SQL-standard equivalent of LIMIT n, >> which is FETCH FIRST n ROWS, that is a syntax >> error because >> "The declared type of <fetch first row count> >> shall be an exact numeric with scale 0 (zero)." >> For us it is a runtime error ... > Should we generate a syntax error here instead of a runtime error? > However, in this case, we will not be able to execute something > like this: > SELECT * FROM t LIMIT 1 + 2; > > I will fill an issue if you think so. > > Actually, generating an error here seems less painful than making > an integer from a double. > >> " >> tarantool> n = 0.000000001 >> --- >> ... >> tarantool> box.execute([[SELECT 5 LIMIT ?;]], {n}) >> --- >> - null >> - 'Failed to execute SQL statement: Only positive >> integers are allowed in the LIMIT >> clause' >> ... >> " >> and you are proposing that there should be no error. > No, I think the error should be here. However, I'm not sure, so I > asked you. > >> On the other hand, >> box.execute([[CREATE TABLE t (s1 VARCHAR(2.0) PRIMARY KEY);]]) >> causes a "Syntax error", and you are not proposing >> that we change that. >> Another case is >> box.execute([[SELECT SUBSTR('abcde',2.99);]]) >> which is legal, 2.99 is truncated to 2. > > I think this is worth fixing. Or is it better the way it is now? > > >> >> Therefore, I think your proposal means: >> If n is not an integer, >> and that is not detected as a syntax error, >> then there should be no error or warning. >> And I think the precedent of substr() means: >> there should be truncation not rounding. >> >> If I have understood correctly, then I agree. >> >> Peter Gulutzan >> >> I did not understand correctly, but now I do. Your example SELECT * FROM t LIMIT 1 + 2; must be allowed because it is documented behaviour. The manual says, in section "LIMIT clause": "Expressions may contain integers and arithmetic operators or functions, for example ABS(-3 / 1). However, the result must be an integer value greater than or equal to zero." Therefore a syntax error check could only look for a single literal value, and would be unnecessary because the runtime check would continue to exist. Currently this is legal: SELECT char(1.1), randomblob(1.1), substr('a', 1.1, 1.1), zeroblob(1.1) GROUP BY 1.1 ORDER BY 1.1; But this is not legal: SELECT 1.1 LIMIT 1.1 OFFSET 1.1; That is not consistent handling of cases where (as far as I can see) the only sensible arguments are unsigned integers. Therefore, although this might cause a million-row insertion to fail on the millionth row, 1.1 should cause an error. A very-low-priority bug. Peter Gulutzan ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-06-25 19:39 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-22 11:52 [Tarantool-discussions] Implicit cast for assignment between numeric types and type mismatch error description Mergen Imeev 2020-06-22 18:43 ` Peter Gulutzan 2020-06-23 16:15 ` Mergen Imeev 2020-06-23 19:42 ` Peter Gulutzan 2020-06-25 8:02 ` Mergen Imeev 2020-06-25 19:39 ` Peter Gulutzan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox