From: "n.pettik" <korablev@tarantool.org> To: tarantool-patches@freelists.org Cc: Ivan Koptelov <ivan.koptelov@tarantool.org> Subject: [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions Date: Mon, 25 Mar 2019 22:26:41 +0300 [thread overview] Message-ID: <6472870C-8952-43AC-9B86-8BB2E006502A@tarantool.org> (raw) In-Reply-To: <fce30bfd8f7f77c2892c4d48e5fae13d562f8cae.1553078164.git.ivan.koptelov@tarantool.org> > On 20 Mar 2019, at 14:11, Ivan Koptelov <ivan.koptelov@tarantool.org> wrote: > > Before the patch determination of collation in SQL functions > (used only in instr) was too narrow: the arguments were scanned > from left to right, till the argument with collation was > encountered, then its collation was used. > Now every arguments collation is considered. The right collation > which would be used in function is determined using ANSI > compatibility rules ("type combination" rules). > > Note: currently only instr() a.k.a position() uses mechanism > described above, other functions (except aggregate) simply > ignores collations. That’s not true: I see that min-max aggregates also feature this flag: FUNCTION(min, -1, 0, 1, minmaxFunc, FIELD_TYPE_SCALAR), Fourth param indicates whether SQL_FUNC_NEEDCOLL is set or not. > --- > src/box/sql/expr.c | 69 ++++++++++++++++++++++++++++++++++++++++---- > src/box/sql/sqlInt.h | 8 ++++- > 2 files changed, 70 insertions(+), 7 deletions(-) > > diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c > index a2c70935e..2f48d90c6 100644 > --- a/src/box/sql/expr.c > +++ b/src/box/sql/expr.c > @@ -4093,16 +4093,73 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) > testcase(i == 31); > constMask |= MASKBIT32(i); > } > - if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != > - 0 && coll == NULL) { > - bool unused; > - uint32_t id; > + } > + /* > + * Function arguments may have different > + * collations. The following code > + * checks if they are compatible and > + * finds the collation to be used. This > + * is done using ANSI rules from > + * collations_check_compatibility(). > + */ > + if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 && > + coll == NULL) { > + struct coll *unused = NULL; > + uint32_t curr_id = COLL_NONE; > + bool is_curr_forced = false; > + > + uint32_t temp_id = COLL_NONE; > + bool is_temp_forced = false; > + > + uint32_t lhs_id = COLL_NONE; > + bool is_lhs_forced = false; > + > + uint32_t rhs_id = COLL_NONE; > + bool is_rhs_forced = false; > + > + for (int i = 0; i < nFarg; i++) { > if (sql_expr_coll(pParse, > pFarg->a[i].pExpr, > - &unused, &id, > - &coll) != 0) > + &is_lhs_forced, > + &lhs_id, > + &unused) != 0) > return 0; > + > + for (int j = i + 1; j < nFarg; j++) { > + if (sql_expr_coll(pParse, > + pFarg->a[j].pExpr, > + &is_rhs_forced, > + &rhs_id, > + &unused) != 0) > + return 0; Seems like you need only one pass saving resulting collation. Resulting collation shouldn’t depend on way of passing through arguments. And second call of collations_check_copatiility() is redundant as well. Now you are using 2n^2 calls of this function, but n is enough: you compare collation of first argument with second one and save result in tmp. Then, compare tmp with third argument etc. > + > + if (collations_check_compatibility( > + lhs_id, is_lhs_forced, > + rhs_id, is_rhs_forced, > + &temp_id) != 0) { > + pParse->rc = > + SQL_TARANTOOL_ERROR; > + pParse->nErr++; > + return 0; > + } > + > + is_temp_forced = (temp_id == > + lhs_id) ? > + is_lhs_forced : > + is_rhs_forced; > + > + if (collations_check_compatibility( > + curr_id, is_curr_forced, > + temp_id, is_temp_forced, > + &curr_id) != 0) { > + pParse->rc = > + SQL_TARANTOOL_ERROR; > + pParse->nErr++; > + return 0; > + } > + } > } > + coll = coll_by_id(curr_id)->coll; > } > if (pFarg) { > if (constMask) { > diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h > index 8967ea3e0..47ee474bb 100644 > --- a/src/box/sql/sqlInt.h > +++ b/src/box/sql/sqlInt.h > @@ -1660,7 +1660,13 @@ struct FuncDestructor { > #define SQL_FUNC_LIKE 0x0004 /* Candidate for the LIKE optimization */ > #define SQL_FUNC_CASE 0x0008 /* Case-sensitive LIKE-type function */ > #define SQL_FUNC_EPHEM 0x0010 /* Ephemeral. Delete with VDBE */ > -#define SQL_FUNC_NEEDCOLL 0x0020 /* sqlGetFuncCollSeq() might be called */ > +#define SQL_FUNC_NEEDCOLL 0x0020 /* sqlGetFuncCollSeq() might be called. > + * The flag is set when the collation > + * of function arguments should be > + * determined, using rules in > + * collations_check_compatibility() > + * function. > + */ > #define SQL_FUNC_LENGTH 0x0040 /* Built-in length() function */ > #define SQL_FUNC_TYPEOF 0x0080 /* Built-in typeof() function */ > #define SQL_FUNC_COUNT 0x0100 /* Built-in count(*) aggregate */ > -- Please, provide basic test cases involving one or more built-in functions and incompatible arguments (at least min-max funcs use it). Moreover, this flag can’t be set for user-defined functions, which is pretty sad.
next prev parent reply other threads:[~2019-03-25 19:26 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-03-20 11:11 [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func Ivan Koptelov 2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov 2019-03-25 19:26 ` n.pettik [this message] 2019-03-27 13:38 ` [tarantool-patches] " i.koptelov 2019-03-28 12:50 ` n.pettik 2019-03-28 14:08 ` i.koptelov 2019-03-29 9:57 ` n.pettik 2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov 2019-03-20 12:59 ` [tarantool-patches] Re: [PATCH 1/2] " i.koptelov 2019-03-26 12:32 ` [tarantool-patches] Re: [PATCH 2/2] " n.pettik 2019-03-27 13:39 ` i.koptelov 2019-03-28 12:57 ` n.pettik 2019-04-01 14:15 ` [tarantool-patches] Re: [PATCH 0/2] sql: add better coll. determination & position func Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=6472870C-8952-43AC-9B86-8BB2E006502A@tarantool.org \ --to=korablev@tarantool.org \ --cc=ivan.koptelov@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox