From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: "n.pettik" <korablev@tarantool.org>, tarantool-patches@freelists.org Subject: [tarantool-patches] Re: [PATCH 4/9] sql: introduce type boolean Date: Mon, 22 Apr 2019 21:02:20 +0300 [thread overview] Message-ID: <afbeb182-e5d0-4e5a-e010-57913e36d345@tarantool.org> (raw) In-Reply-To: <B0F64CC7-8817-4251-9849-5F89C5E398D0@tarantool.org> Hi! Thanks for the fixes! See 3 comments below. > diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c > index 6b38e8e66..d70b64c45 100644 > --- a/src/box/sql/expr.c > +++ b/src/box/sql/expr.c > @@ -3775,7 +3766,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) > } > case TK_TRUE: > case TK_FALSE: { > - vdbe_emit_bool(pParse->pVdbe, pExpr, target); > + sqlVdbeAddOp2(v, OP_Bool, op == TK_TRUE, target); > return target; > } > >> 6. I see, that all other 'case's set pExpr->type. Why don't you >> do it here? > > It is required only for non-constant values. For literals it is assigned > in parse.y spanExpr(). 1. Then why case TK_INTEGER sets 'pExpr->type = FIELD_TYPE_INTEGER', TK_STRING sets 'pExpr->type = FIELD_TYPE_STRING;', TK_BLOB sets 'pExpr->type = FIELD_TYPE_SCALAR;' - I see all of them being set in parse.y. On the contrary, there are places which create a new Expr, and does not set type even in parse.y. For example: 1240: A.pExpr = sql_expr_new_dequoted(pParse->db, TK_INTEGER, &sqlIntTokens[N]); Here the expr is created and spanExpr() is not used. How does it work? >> 8. Looking at all these type comparison prohibitions I am getting >> afraid of how will we select from SCALAR columns? >> >> A schema: >> >> box.execute('CREATE TABLE t (id INTEGER PRIMARY KEY, a SCALAR UNIQUE);') >> box.execute("INSERT INTO t VALUES (1, 1), (2, true), (3, false), (4, 'str')") >> >> SQL select: >> >>> box.execute('SELECT * FROM t WHERE a > 1') >> --- >> - error: 'Type mismatch: can not convert INTEGER to boolean' >> ... >> >> The same Lua select: >> >>> box.space.T.index[1]:select({1}, {iterator = 'GT'}) >> --- >> - - [4, 'str'] >> ... >> >> In fact, now we can not use SCALAR in SQL for any comparisons >> because it will raise type mismatch on literally everything. >> >> What are we going to do with it? IMO, we could add a flag like >> 'is_scalar' to struct Mem in its flags section, which would allow >> to compare this value with anything. > > Konstantin suggested to extend struct Mem with field_type, > which is going to substitute this flag. > >> Looks crutchy though. And is >> not a part of this issue of course, but should be filed into the >> issue list. >> >> I see a similar issue https://github.com/tarantool/tarantool/issues/4124. >> Probably, it is worth extending it instead of opening new. > > I consider this as a trade-off of using scalar: users shouldn’t use this > type until they are really have to. Nevertheless, they still can use CAST > operator and switch case, or write their own lua-function converting values > to required type. 2. Users may have good designed schema and work in each request with only one type in such a column, but still they will get errors just because the iterator can not walk to the needed tree node. Lets modify my example: SELECT * FROM t WHERE a = 1; This request implicitly says that it will work with numbers only, but somehow its success depends on the content of the space, not even related to '1' value. In a nutshell it means that insertion of something into a space can break existing requests not related to the inserted data. It is weird. Such scalar type would be useless. >>> + * >>> + * @param str String to be converted to boolean. >>> + * Assumed to be null terminated. >>> + * @param result Resulting value of cast. >>> + * @retval 0 If string satisfies conditions above. >>> + * @retval -1 Otherwise. >>> + */ >>> +static int >>> +str_cast_to_boolean(const char *str, bool *result) >>> +{ >>> + assert(str != NULL); >>> + for (; *str == ' '; str++); >>> + size_t rest_str_len = strlen(str); >>> + if (rest_str_len == 4 && strncasecmp(str, "true", 4) == 0) { >>> + *result = true; >>> + return 0; >>> + } >>> + if (rest_str_len == 5 && strncasecmp(str, "false", 5) == 0) { >>> + *result = false; >>> + return 0; >>> + } >> >> 16. I guess it is far from the most efficient implementation. You >> calculate length of the whole string before comparison, scanning >> its twice in the worst and most common case. Please, consider my >> fixes below. I did not test them though, so probably there are some >> typos. >> Note, that in my fixes I've already accounted my comment about >> trimming trailing whitespaces. >> >> ============================================================================ >> @@ -625,16 +625,20 @@ str_cast_to_boolean(const char *str, bool *result) >> { >> assert(str != NULL); >> for (; *str == ' '; str++); >> - size_t rest_str_len = strlen(str); >> - if (rest_str_len == 4 && strncasecmp(str, "true", 4) == 0) { >> + if (strncasecmp(str, "true", 4) == 0) { > > We can’t use strncasecmp() without verification the fact that > rest_str_len (i.e. string length after skipping leading spaces) is >= 4. 3.1. Why? According to the standard, strncasecmp() compares not more than 'n' characters taking into account terminating zero. I tried these examples to simulate our case (when 'str' points to the beginning of a possible 'true/false' token). printf("%d\n", strncasecmp("", "true", 4)); printf("%d\n", strncasecmp("t", "true", 4)); printf("%d\n", strncasecmp("tr", "true", 4)); printf("%d\n", strncasecmp("tru", "true", 4)); printf("%d\n", strncasecmp("true ", "true", 4)); I got this output: -116 -114 -117 -101 0 It means, that you do not need to call 'strlen' before 'strncasecmp'. I've dropped 'strlen' again and the tests passed. If you think that explicit length check is mandatory, then please, provide a test. Probably I'm wrong somewhere in my speculations. > >> + str += 4; >> *result = true; >> - return 0; >> - } >> - if (rest_str_len == 5 && strncasecmp(str, "false", 5) == 0) { >> + } else if (strncasecmp(str, "false", 5) == 0) { >> + str += 5; >> *result = false; >> - return 0; >> + } else { >> + return -1; >> } >> - return -1; >> + for (; *str != 0; ++str) { >> + if (*str != ' ') >> + return -1; >> + } >> + return 0; >> } >> ============================================================================ > > Taking into account my comment diff is next: > > diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c > index 92406f938..19b42b992 100644 > --- a/src/box/sql/vdbemem.c > +++ b/src/box/sql/vdbemem.c > @@ -626,15 +626,20 @@ str_cast_to_boolean(const char *str, bool *result) > assert(str != NULL); > for (; *str == ' '; str++); > size_t rest_str_len = strlen(str); 3.2. My main point was that we should not call strlen() - it leads to double scan of the string. If you still need strlen(), then the fixes are useless. But consider my objectives above, about the POSIX standard. > - if (rest_str_len == 4 && strncasecmp(str, "true", 4) == 0) { > + if (rest_str_len >= 4 && strncasecmp(str, "true", 4) == 0) { > *result = true; > - return 0; > - } > - if (rest_str_len == 5 && strncasecmp(str, "false", 5) == 0) { > + str += 4; > + } else if (rest_str_len >= 5 && strncasecmp(str, "false", 5) == 0) { > *result = false; > - return 0; > + str += 5; > + } else { > + return -1; > } > - return -1; > + for (; *str != '\0'; ++str) { > + if (*str != ' ') > + return -1; > + } > + return 0; > } My fixes, on the branch: ================================================================== diff --git a/src/box/sql/func.c b/src/box/sql/func.c index 08bea9d75..dbbb7fe23 100644 --- a/src/box/sql/func.c +++ b/src/box/sql/func.c @@ -1102,8 +1102,8 @@ quoteFunc(sql_context * context, int argc, sql_value ** argv) break; } case MP_BOOL: { - sql_result_text(context, argv[0]->u.b ? "true" : "false", -1, - SQL_TRANSIENT); + sql_result_text(context, sql_value_boolean(argv[0]) ? + "true" : "false", -1, SQL_TRANSIENT); break; } default:{ diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h index 468c89521..1f77d34cd 100644 --- a/src/box/sql/sqlInt.h +++ b/src/box/sql/sqlInt.h @@ -583,7 +583,7 @@ int sql_column_int(sql_stmt *, int iCol); bool -sql_column_boolean(sql_stmt *stmt, int column); +sql_column_boolean(struct sql_stmt *stmt, int column); sql_int64 sql_column_int64(sql_stmt *, int iCol); diff --git a/src/box/sql/vdbeInt.h b/src/box/sql/vdbeInt.h index 73c77efed..db8458a6a 100644 --- a/src/box/sql/vdbeInt.h +++ b/src/box/sql/vdbeInt.h @@ -510,7 +510,7 @@ int sqlVdbeMemIntegerify(Mem *, bool is_forced); int sqlVdbeRealValue(Mem *, double *); int -mem_value_bool(Mem *mem, bool *b); +mem_value_bool(const struct Mem *mem, bool *b); int mem_apply_integer_type(Mem *); int sqlVdbeMemRealify(Mem *); diff --git a/src/box/sql/vdbeapi.c b/src/box/sql/vdbeapi.c index cfd65075e..9bbe6d2ad 100644 --- a/src/box/sql/vdbeapi.c +++ b/src/box/sql/vdbeapi.c @@ -991,7 +991,7 @@ sql_column_int(sql_stmt * pStmt, int i) } bool -sql_column_boolean(sql_stmt *stmt, int i) +sql_column_boolean(struct sql_stmt *stmt, int i) { bool val = sql_value_boolean(columnMem(stmt, i)); columnMallocFailure(stmt); diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c index 19b42b992..0bc1c9466 100644 --- a/src/box/sql/vdbemem.c +++ b/src/box/sql/vdbemem.c @@ -502,9 +502,9 @@ sqlVdbeRealValue(Mem * pMem, double *v) } int -mem_value_bool(Mem *mem, bool *b) +mem_value_bool(const struct Mem *mem, bool *b) { - if (mem->flags & MEM_Bool) { + if ((mem->flags & MEM_Bool) != 0) { *b = mem->u.b; return 0; } @@ -614,9 +614,9 @@ sqlVdbeMemNumerify(Mem * pMem) * For instance, " tRuE " can be successfully converted to * boolean value true. * - * @param str String to be converted to boolean. - * Assumed to be null terminated. - * @param result Resulting value of cast. + * @param str String to be converted to boolean. Assumed to be + * null terminated. + * @param[out] result Resulting value of cast. * @retval 0 If string satisfies conditions above. * @retval -1 Otherwise. */ @@ -625,11 +625,10 @@ str_cast_to_boolean(const char *str, bool *result) { assert(str != NULL); for (; *str == ' '; str++); - size_t rest_str_len = strlen(str); - if (rest_str_len >= 4 && strncasecmp(str, "true", 4) == 0) { + if (strncasecmp(str, "true", 4) == 0) { *result = true; str += 4; - } else if (rest_str_len >= 5 && strncasecmp(str, "false", 5) == 0) { + } else if (strncasecmp(str, "false", 5) == 0) { *result = false; str += 5; } else { @@ -680,7 +679,7 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type) pMem->u.b = value; return 0; } - if ((pMem->flags & MEM_Bool) != 0) + if ((pMem->flags & MEM_Bool) != 0) return 0; return -1; case FIELD_TYPE_INTEGER:
next prev parent reply other threads:[~2019-04-22 18:02 UTC|newest] Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-04-14 15:03 [tarantool-patches] [PATCH 0/9] Introduce type BOOLEAN in SQL Nikita Pettik 2019-04-14 15:03 ` [tarantool-patches] [PATCH 1/9] sql: refactor mem_apply_numeric_type() Nikita Pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 2/9] sql: disallow text values participate in sum() aggregate Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 3/9] sql: use msgpack types instead of custom ones Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 4/9] sql: introduce type boolean Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy [this message] 2019-04-23 19:58 ` n.pettik 2019-04-23 21:06 ` Vladislav Shpilevoy 2019-04-14 15:04 ` [tarantool-patches] [PATCH 5/9] sql: improve type determination for column meta Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 6/9] sql: make comparison predicate return boolean Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 7/9] sql: make predicates accept and " Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:55 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 9/9] sql: make <search condition> accept only boolean Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:55 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:59 ` n.pettik 2019-04-23 21:06 ` Vladislav Shpilevoy 2019-04-23 22:01 ` n.pettik [not found] ` <b2a84f129c2343d3da3311469cbb7b20488a21c2.1555252410.git.korablev@tarantool.org> 2019-04-16 14:12 ` [tarantool-patches] Re: [PATCH 8/9] sql: make LIKE predicate return boolean result Vladislav Shpilevoy 2019-04-18 17:55 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-24 10:28 ` [tarantool-patches] Re: [PATCH 0/9] Introduce type BOOLEAN in SQL Vladislav Shpilevoy 2019-04-25 8:46 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=afbeb182-e5d0-4e5a-e010-57913e36d345@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH 4/9] sql: introduce type boolean' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox