[tarantool-patches] Re: [PATCH 01/13] sql: Convert big integers from string

Mon Mar 25 18:10:55 MSK 2019

> On 15 Mar 2019, at 18:45, Stanislav Zudin <szudin at tarantool.org> wrote:
> 
> Correctly handles the signed and unsigned integers on the way from sql
> expression to VDBE.
> VDBE doesn't distinguish yet signed and unsigned integers.

Firstly, please add tag to each patch related to patch-set:

Part of #….

Or

Needed for #…

Secondly, several tests fail on this patch:
Failed tasks:
- [sql-tap/cast.test.lua, memtx]
- [sql-tap/tkt-a8a0d2996a.test.lua, memtx]
- [sql-tap/misc3.test.lua, memtx]
- [sql-tap/in4.test.lua, memtx]

sql/gh-2347-max-int-literals.test.lua           vinyl           [ fail ]

AFAIK we follow policy stating that each separate patch
shouldn’t break existing tests. Mb you should reconsider
structure of your patch-set.

> src/box/sql/expr.c   |  23 ++++---
> src/box/sql/main.c   |   2 +-
> src/box/sql/sqlInt.h |  63 +++++++++++------
> src/box/sql/util.c   | 160 +++++++++++++++++++------------------------
> src/box/sql/vdbe.c   |   2 +-
> 5 files changed, 125 insertions(+), 125 deletions(-)
> 
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index 82688dff3..3fbfab7a0 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -3335,23 +3335,24 @@ expr_code_int(struct Parse *parse, struct Expr *expr, bool is_neg,
> 		int64_t value;
> 		const char *z = expr->u.zToken;
> 		assert(z != NULL);
> -		int c = sql_dec_or_hex_to_i64(z, &value);
> -		if (c == 1 || (c == 2 && !is_neg) ||
> -		    (is_neg && value == SMALLEST_INT64)) {
> +		enum atoi_result c = sql_dec_or_hex_to_i64(z, is_neg, &value);

I suggest to make all atoi return -1 in case of errors (including overflow),
and 0 in case of successful conversion. Error itself you can set via diag_set().
Sign of the result you can pass via output param.
This way would be more Tarantool-like.

Moreover, instead of storing UINT in INT, I suggest to use fair enum and
output param is_uint/is_negative/is_whatever/…

Finally, please add comment explaining that unsigned value is returned
only in case when value can’t be fitted into int64 range. I don’t see
such explanation, so at the first sight code may look weird. 

> +		switch(c) {
> +		case ATOI_OVERFLOW:
> 			if (sql_strnicmp(z, "0x", 2) == 0) {
> 				sqlErrorMsg(parse,
> -						"hex literal too big: %s%s",
> -						is_neg ? "-" : "", z);
> +					    "hex literal too big: %s%s",
> +					    is_neg ? "-" : "", z);
> 			} else {
> 				sqlErrorMsg(parse,
> -						"oversized integer: %s%s",
> -						is_neg ? "-" : "", z);
> +					    "oversized integer: %s%s",
> +					    is_neg ? "-" : "", z);
> 			}
> -		} else {
> -			if (is_neg)
> -				value = c == 2 ? SMALLEST_INT64 : -value;
> +			break;
> +		case ATOI_UNSIGNED:
> +		case ATOI_SIGNED:
> 			sqlVdbeAddOp4Dup8(v, OP_Int64, 0, mem, 0,
> -					      (u8 *)&value, P4_INT64);
> +					  (u8 *)&value, P4_INT64);
> +			break;
> 		}
> 	}
> }
> 
> 
> index c89e2e8ab..be77f72f8 100644
> --- a/src/box/sql/util.c
> +++ b/src/box/sql/util.c
> @@ -591,110 +591,56 @@ sqlAtoF(const char *z, double *pResult, int length)
> #endif				/* SQL_OMIT_FLOATING_POINT */
> -int
> +enum atoi_result
> sql_atoi64(const char *z, int64_t *val, int length)
> {
> -	int incr = 1;
> -	u64 u = 0;
> 	int neg = 0;		/* assume positive */
> -	int i;
> -	int c = 0;
> -	int nonNum = 0;		/* True if input contains UTF16 with high byte non-zero */
> -	const char *zStart;
> 	const char *zEnd = z + length;
> -	incr = 1;
> +	int incr = 1;
> 	while (z < zEnd && sqlIsspace(*z))
> 		z += incr;
> -	if (z < zEnd) {
> -		if (*z == '-') {
> -			neg = 1;
> -			z += incr;
> -		} else if (*z == '+') {
> -			z += incr;
> -		}
> -	}
> -	zStart = z;
> -	/* Skip leading zeros. */
> -	while (z < zEnd && z[0] == '0') {
> +
> +	if (z >= zEnd)
> +		return ATOI_OVERFLOW; /* invalid format */

It’s not fair to return OVERFLOW result in case of “invalid format”.
See my comment concerning return values above.

> +	if (*z == '-') {
> +		neg = 1;
> 		z += incr;
> 	}
> -	for (i = 0; &z[i] < zEnd && (c = z[i]) >= '0' && c <= '9';
> -	     i += incr) {
> -		u = u * 10 + c - '0';
> -	}
> -	if (u > LARGEST_INT64) {
> -		*val = neg ? SMALLEST_INT64 : LARGEST_INT64;
> -	} else if (neg) {
> -		*val = -(i64) u;
> +
> +	char* end = NULL;

According to docs, you should set errno to 0 before
calling this function.

> +	u64 u = strtoull(z, &end, 10);
> +	if (end == z)
> +		return ATOI_OVERFLOW;

In this case, it’s not overflow but error like “"No digits were found”.

> +	if (errno == ERANGE)
> +		return ATOI_OVERFLOW;

Not only ERANGE errors can occur.
Correct way of testing on occurred errors is shown here:
https://linux.die.net/man/3/strtol

 if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN))
            || (errno != 0 && val == 0))

If these error should be caught before, then add assertions pls.