From: "n.pettik" <korablev@tarantool.org> To: tarantool-patches@freelists.org Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Subject: [tarantool-patches] Re: [PATCH 4/9] sql: introduce type boolean Date: Tue, 23 Apr 2019 22:58:26 +0300 [thread overview] Message-ID: <45E647F0-79F4-41DE-BD69-996A2D3EB7F1@tarantool.org> (raw) In-Reply-To: <afbeb182-e5d0-4e5a-e010-57913e36d345@tarantool.org> >> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c >> index 6b38e8e66..d70b64c45 100644 >> --- a/src/box/sql/expr.c >> +++ b/src/box/sql/expr.c >> @@ -3775,7 +3766,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) >> } >> case TK_TRUE: >> case TK_FALSE: { >> - vdbe_emit_bool(pParse->pVdbe, pExpr, target); >> + sqlVdbeAddOp2(v, OP_Bool, op == TK_TRUE, target); >> return target; >> } >> >>> 6. I see, that all other 'case's set pExpr->type. Why don't you >>> do it here? >> >> It is required only for non-constant values. For literals it is assigned >> in parse.y spanExpr(). > > 1. Then why case TK_INTEGER sets 'pExpr->type = FIELD_TYPE_INTEGER', > TK_STRING sets 'pExpr->type = FIELD_TYPE_STRING;', TK_BLOB sets > 'pExpr->type = FIELD_TYPE_SCALAR;' - I see all of them being set in > parse.y. Those are literals, so as I said above, their types are assigned right during parsing. If you are asking “why do we need to do so?”, then answer is “they are leaf nodes and it is most appropriate moment to do so”. > On the contrary, there are places which create a new Expr, and > does not set type even in parse.y. For example: > > 1240: A.pExpr = sql_expr_new_dequoted(pParse->db, TK_INTEGER, &sqlIntTokens[N]); Speaking of this particular example, yes, it was buggy place. I’ve fixed it in "sql: make predicates accept and return boolean" patch > Here the expr is created and spanExpr() is not used. How does it > work? > >>> 8. Looking at all these type comparison prohibitions I am getting >>> afraid of how will we select from SCALAR columns? >>> >>> A schema: >>> >>> box.execute('CREATE TABLE t (id INTEGER PRIMARY KEY, a SCALAR UNIQUE);') >>> box.execute("INSERT INTO t VALUES (1, 1), (2, true), (3, false), (4, 'str')") >>> >>> SQL select: >>> >>>> box.execute('SELECT * FROM t WHERE a > 1') >>> --- >>> - error: 'Type mismatch: can not convert INTEGER to boolean' >>> ... >>> >>> The same Lua select: >>> >>>> box.space.T.index[1]:select({1}, {iterator = 'GT'}) >>> --- >>> - - [4, 'str'] >>> ... >>> >>> In fact, now we can not use SCALAR in SQL for any comparisons >>> because it will raise type mismatch on literally everything. >>> >>> What are we going to do with it? IMO, we could add a flag like >>> 'is_scalar' to struct Mem in its flags section, which would allow >>> to compare this value with anything. >> >> Konstantin suggested to extend struct Mem with field_type, >> which is going to substitute this flag. >> >>> Looks crutchy though. And is >>> not a part of this issue of course, but should be filed into the >>> issue list. >>> >>> I see a similar issue https://github.com/tarantool/tarantool/issues/4124. >>> Probably, it is worth extending it instead of opening new. >> >> I consider this as a trade-off of using scalar: users shouldn’t use this >> type until they are really have to. Nevertheless, they still can use CAST >> operator and switch case, or write their own lua-function converting values >> to required type. > > 2. Users may have good designed schema and work in each request with only > one type in such a column, but still they will get errors just because > the iterator can not walk to the needed tree node. Lets modify my example: > > SELECT * FROM t WHERE a = 1; > > This request implicitly says that it will work with numbers only, but > somehow its success depends on the content of the space, not even > related to '1' value. In a nutshell it means that insertion of something > into a space can break existing requests not related to the inserted > data. It is weird. Such scalar type would be useless. Yep, we can observe the same behaviour in DB2. For instance: db2 => create table t (id varchar(10)) DB20000I The SQL command completed successfully. db2 => insert into t values(1) DB20000I The SQL command completed successfully. db2 => select id from t where id = 1 ID ---------- 1 1 record(s) selected. db2 => insert into t values('abc’) DB20000I The SQL command completed successfully. db2 => insert into t values(1) DB20000I The SQL command completed successfully. db2 => select id from t where id = 1 ID ---------- 1 SQL0420N Invalid character found in a character string argument of the function "DECFLOAT". SQLSTATE=22018 After insertion of unconvertible to integer value ‘abc’, select starts to fail. Note that DB2 is considered to be most compatible with ANSI according to members of our server team. >>>> + * >>>> + * @param str String to be converted to boolean. >>>> + * Assumed to be null terminated. >>>> + * @param result Resulting value of cast. >>>> + * @retval 0 If string satisfies conditions above. >>>> + * @retval -1 Otherwise. >>>> + */ >>>> +static int >>>> +str_cast_to_boolean(const char *str, bool *result) >>>> +{ >>>> + assert(str != NULL); >>>> + for (; *str == ' '; str++); >>>> + size_t rest_str_len = strlen(str); >>>> + if (rest_str_len == 4 && strncasecmp(str, "true", 4) == 0) { >>>> + *result = true; >>>> + return 0; >>>> + } >>>> + if (rest_str_len == 5 && strncasecmp(str, "false", 5) == 0) { >>>> + *result = false; >>>> + return 0; >>>> + } >>> >>> 16. I guess it is far from the most efficient implementation. You >>> calculate length of the whole string before comparison, scanning >>> its twice in the worst and most common case. Please, consider my >>> fixes below. I did not test them though, so probably there are some >>> typos. >>> Note, that in my fixes I've already accounted my comment about >>> trimming trailing whitespaces. >>> >>> ============================================================================ >>> @@ -625,16 +625,20 @@ str_cast_to_boolean(const char *str, bool *result) >>> { >>> assert(str != NULL); >>> for (; *str == ' '; str++); >>> - size_t rest_str_len = strlen(str); >>> - if (rest_str_len == 4 && strncasecmp(str, "true", 4) == 0) { >>> + if (strncasecmp(str, "true", 4) == 0) { >> >> We can’t use strncasecmp() without verification the fact that >> rest_str_len (i.e. string length after skipping leading spaces) is >= 4. > > 3.1. Why? According to the standard, strncasecmp() compares not more than > 'n' characters taking into account terminating zero. I tried these > examples to simulate our case (when 'str' points to the beginning of a > possible 'true/false' token). > > printf("%d\n", strncasecmp("", "true", 4)); > printf("%d\n", strncasecmp("t", "true", 4)); > printf("%d\n", strncasecmp("tr", "true", 4)); > printf("%d\n", strncasecmp("tru", "true", 4)); > printf("%d\n", strncasecmp("true ", "true", 4)); > > I got this output: > > -116 > -114 > -117 > -101 > 0 > > It means, that you do not need to call 'strlen' before > 'strncasecmp'. I've dropped 'strlen' again and the tests > passed. If you think that explicit length check is mandatory, > then please, provide a test. Probably I'm wrong somewhere in > my speculations. Well, you are right: I missed the fact that it compares UP to n symbols. I’m sorry to have confused you. >> - if (rest_str_len == 4 && strncasecmp(str, "true", 4) == 0) { >> + if (rest_str_len >= 4 && strncasecmp(str, "true", 4) == 0) { >> *result = true; >> - return 0; >> - } >> - if (rest_str_len == 5 && strncasecmp(str, "false", 5) == 0) { >> + str += 4; >> + } else if (rest_str_len >= 5 && strncasecmp(str, "false", 5) == 0) { >> *result = false; >> - return 0; >> + str += 5; >> + } else { >> + return -1; >> } >> - return -1; >> + for (; *str != '\0'; ++str) { >> + if (*str != ' ') >> + return -1; >> + } >> + return 0; >> } > > My fixes, on the branch: Ok, applied.
next prev parent reply other threads:[~2019-04-23 19:58 UTC|newest] Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-04-14 15:03 [tarantool-patches] [PATCH 0/9] Introduce type BOOLEAN in SQL Nikita Pettik 2019-04-14 15:03 ` [tarantool-patches] [PATCH 1/9] sql: refactor mem_apply_numeric_type() Nikita Pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 2/9] sql: disallow text values participate in sum() aggregate Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 3/9] sql: use msgpack types instead of custom ones Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 4/9] sql: introduce type boolean Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik [this message] 2019-04-23 21:06 ` Vladislav Shpilevoy 2019-04-14 15:04 ` [tarantool-patches] [PATCH 5/9] sql: improve type determination for column meta Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 6/9] sql: make comparison predicate return boolean Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:54 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 7/9] sql: make predicates accept and " Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:55 ` n.pettik 2019-04-14 15:04 ` [tarantool-patches] [PATCH 9/9] sql: make <search condition> accept only boolean Nikita Pettik 2019-04-16 14:12 ` [tarantool-patches] " Vladislav Shpilevoy 2019-04-18 17:55 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:59 ` n.pettik 2019-04-23 21:06 ` Vladislav Shpilevoy 2019-04-23 22:01 ` n.pettik [not found] ` <b2a84f129c2343d3da3311469cbb7b20488a21c2.1555252410.git.korablev@tarantool.org> 2019-04-16 14:12 ` [tarantool-patches] Re: [PATCH 8/9] sql: make LIKE predicate return boolean result Vladislav Shpilevoy 2019-04-18 17:55 ` n.pettik 2019-04-22 18:02 ` Vladislav Shpilevoy 2019-04-23 19:58 ` n.pettik 2019-04-24 10:28 ` [tarantool-patches] Re: [PATCH 0/9] Introduce type BOOLEAN in SQL Vladislav Shpilevoy 2019-04-25 8:46 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=45E647F0-79F4-41DE-BD69-996A2D3EB7F1@tarantool.org \ --to=korablev@tarantool.org \ --cc=tarantool-patches@freelists.org \ --cc=v.shpilevoy@tarantool.org \ --subject='[tarantool-patches] Re: [PATCH 4/9] sql: introduce type boolean' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox