[tarantool-patches] Re: [PATCH 01/13] sql: Convert big integers from string

Mon Apr 1 23:39:06 MSK 2019

On 25.03.2019 18:10, n.pettik wrote:
> 
> 
>> On 15 Mar 2019, at 18:45, Stanislav Zudin <szudin at tarantool.org> wrote:
>>
>> Correctly handles the signed and unsigned integers on the way from sql
>> expression to VDBE.
>> VDBE doesn't distinguish yet signed and unsigned integers.
> 
> Firstly, please add tag to each patch related to patch-set:
> 
> Part of #….
> 
> Or
> 
> Needed for #…

Got it.

> 
> Secondly, several tests fail on this patch:
> Failed tasks:
> - [sql-tap/cast.test.lua, memtx]
> - [sql-tap/tkt-a8a0d2996a.test.lua, memtx]
> - [sql-tap/misc3.test.lua, memtx]
> - [sql-tap/in4.test.lua, memtx]
> 
> sql/gh-2347-max-int-literals.test.lua           vinyl           [ fail ]
> 

The final commit does pass these tests.

> AFAIK we follow policy stating that each separate patch
> shouldn’t break existing tests. Mb you should reconsider
> structure of your patch-set.
> 
>> src/box/sql/expr.c   |  23 ++++---
>> src/box/sql/main.c   |   2 +-
>> src/box/sql/sqlInt.h |  63 +++++++++++------
>> src/box/sql/util.c   | 160 +++++++++++++++++++------------------------
>> src/box/sql/vdbe.c   |   2 +-
>> 5 files changed, 125 insertions(+), 125 deletions(-)
>>
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index 82688dff3..3fbfab7a0 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -3335,23 +3335,24 @@ expr_code_int(struct Parse *parse, struct Expr *expr, bool is_neg,
>> 		int64_t value;
>> 		const char *z = expr->u.zToken;
>> 		assert(z != NULL);
>> -		int c = sql_dec_or_hex_to_i64(z, &value);
>> -		if (c == 1 || (c == 2 && !is_neg) ||
>> -		    (is_neg && value == SMALLEST_INT64)) {
>> +		enum atoi_result c = sql_dec_or_hex_to_i64(z, is_neg, &value);
> 
> I suggest to make all atoi return -1 in case of errors (including overflow),
> and 0 in case of successful conversion. Error itself you can set via diag_set().
> Sign of the result you can pass via output param.
> This way would be more Tarantool-like.

Accepted. Fixed.

> 
> Moreover, instead of storing UINT in INT, I suggest to use fair enum and
> output param is_uint/is_negative/is_whatever/…
> 
> Finally, please add comment explaining that unsigned value is returned
> only in case when value can’t be fitted into int64 range. I don’t see
> such explanation, so at the first sight code may look weird.

Done.
BTW. The declaration of atoi* function contains such explanation

> 
>> +		switch(c) {
>> +		case ATOI_OVERFLOW:
>> 			if (sql_strnicmp(z, "0x", 2) == 0) {
>> 				sqlErrorMsg(parse,
>> -						"hex literal too big: %s%s",
>> -						is_neg ? "-" : "", z);
>> +					    "hex literal too big: %s%s",
>> +					    is_neg ? "-" : "", z);
>> 			} else {
>> 				sqlErrorMsg(parse,
>> -						"oversized integer: %s%s",
>> -						is_neg ? "-" : "", z);
>> +					    "oversized integer: %s%s",
>> +					    is_neg ? "-" : "", z);
>> 			}
>> -		} else {
>> -			if (is_neg)
>> -				value = c == 2 ? SMALLEST_INT64 : -value;
>> +			break;
>> +		case ATOI_UNSIGNED:
>> +		case ATOI_SIGNED:
>> 			sqlVdbeAddOp4Dup8(v, OP_Int64, 0, mem, 0,
>> -					      (u8 *)&value, P4_INT64);
>> +					  (u8 *)&value, P4_INT64);
>> +			break;
>> 		}
>> 	}
>> }
>>
>>
>> index c89e2e8ab..be77f72f8 100644
>> --- a/src/box/sql/util.c
>> +++ b/src/box/sql/util.c
>> @@ -591,110 +591,56 @@ sqlAtoF(const char *z, double *pResult, int length)
>> #endif				/* SQL_OMIT_FLOATING_POINT */
>> -int
>> +enum atoi_result
>> sql_atoi64(const char *z, int64_t *val, int length)
>> {
>> -	int incr = 1;
>> -	u64 u = 0;
>> 	int neg = 0;		/* assume positive */
>> -	int i;
>> -	int c = 0;
>> -	int nonNum = 0;		/* True if input contains UTF16 with high byte non-zero */
>> -	const char *zStart;
>> 	const char *zEnd = z + length;
>> -	incr = 1;
>> +	int incr = 1;
>> 	while (z < zEnd && sqlIsspace(*z))
>> 		z += incr;
>> -	if (z < zEnd) {
>> -		if (*z == '-') {
>> -			neg = 1;
>> -			z += incr;
>> -		} else if (*z == '+') {
>> -			z += incr;
>> -		}
>> -	}
>> -	zStart = z;
>> -	/* Skip leading zeros. */
>> -	while (z < zEnd && z[0] == '0') {
>> +
>> +	if (z >= zEnd)
>> +		return ATOI_OVERFLOW; /* invalid format */
> 
> It’s not fair to return OVERFLOW result in case of “invalid format”.
> See my comment concerning return values above.
> 
>> +	if (*z == '-') {
>> +		neg = 1;
>> 		z += incr;
>> 	}
>> -	for (i = 0; &z[i] < zEnd && (c = z[i]) >= '0' && c <= '9';
>> -	     i += incr) {
>> -		u = u * 10 + c - '0';
>> -	}
>> -	if (u > LARGEST_INT64) {
>> -		*val = neg ? SMALLEST_INT64 : LARGEST_INT64;
>> -	} else if (neg) {
>> -		*val = -(i64) u;
>> +
>> +	char* end = NULL;
> 
> According to docs, you should set errno to 0 before
> calling this function.

Good point.
Fixed.

> 
>> +	u64 u = strtoull(z, &end, 10);
>> +	if (end == z)
>> +		return ATOI_OVERFLOW;
> 
> In this case, it’s not overflow but error like “"No digits were found”.
> 
>> +	if (errno == ERANGE)
>> +		return ATOI_OVERFLOW;
> 
> Not only ERANGE errors can occur.
> Correct way of testing on occurred errors is shown here:
> https://linux.die.net/man/3/strtol
> 
>   if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN))
>              || (errno != 0 && val == 0))

IMO the (errno != 0) covers all faulty cases.

> 
> If these error should be caught before, then add assertions pls.
> 
>