Tarantool development patches archive
 help / color / mirror / Atom feed
From: "n.pettik" <korablev@tarantool.org>
To: tarantool-patches@freelists.org
Cc: Ivan Koptelov <ivan.koptelov@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions
Date: Mon, 25 Mar 2019 22:26:41 +0300	[thread overview]
Message-ID: <6472870C-8952-43AC-9B86-8BB2E006502A@tarantool.org> (raw)
In-Reply-To: <fce30bfd8f7f77c2892c4d48e5fae13d562f8cae.1553078164.git.ivan.koptelov@tarantool.org>



> On 20 Mar 2019, at 14:11, Ivan Koptelov <ivan.koptelov@tarantool.org> wrote:
> 
> Before the patch determination of collation in SQL functions
> (used only in instr) was too narrow: the arguments were scanned
> from left to right, till the argument with collation was
> encountered, then its collation was used.
> Now every arguments collation is considered. The right collation
> which would be used in function is determined using ANSI
> compatibility rules ("type combination" rules).
> 
> Note: currently only instr() a.k.a position() uses mechanism
> described above, other functions (except aggregate) simply
> ignores collations.

That’s not true: I see that min-max aggregates also feature this flag:

FUNCTION(min, -1, 0, 1, minmaxFunc, FIELD_TYPE_SCALAR),

Fourth param indicates whether SQL_FUNC_NEEDCOLL is set or not.

> ---
> src/box/sql/expr.c   | 69 ++++++++++++++++++++++++++++++++++++++++----
> src/box/sql/sqlInt.h |  8 ++++-
> 2 files changed, 70 insertions(+), 7 deletions(-)
> 
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index a2c70935e..2f48d90c6 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -4093,16 +4093,73 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
> 					testcase(i == 31);
> 					constMask |= MASKBIT32(i);
> 				}
> -				if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) !=
> -				    0 && coll == NULL) {
> -					bool unused;
> -					uint32_t id;
> +			}
> +			/*
> +			 * Function arguments may have different
> +			 * collations. The following code
> +			 * checks if they are compatible and
> +			 * finds the collation to be used. This
> +			 * is done using ANSI rules from
> +			 * collations_check_compatibility().
> +			 */
> +			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
> +			     coll == NULL) {
> +				struct coll *unused = NULL;
> +				uint32_t curr_id = COLL_NONE;
> +				bool is_curr_forced = false;
> +
> +				uint32_t temp_id = COLL_NONE;
> +				bool is_temp_forced = false;
> +
> +				uint32_t lhs_id = COLL_NONE;
> +				bool is_lhs_forced = false;
> +
> +				uint32_t rhs_id = COLL_NONE;
> +				bool is_rhs_forced = false;
> +
> +				for (int i = 0; i < nFarg; i++) {
> 					if (sql_expr_coll(pParse,
> 							  pFarg->a[i].pExpr,
> -							  &unused, &id,
> -							  &coll) != 0)
> +							  &is_lhs_forced,
> +							  &lhs_id,
> +							  &unused) != 0)
> 						return 0;
> +
> +					for (int j = i + 1; j < nFarg; j++) {
> +						if (sql_expr_coll(pParse,
> +								  pFarg->a[j].pExpr,
> +								  &is_rhs_forced,
> +								  &rhs_id,
> +								  &unused) != 0)
> +							return 0;

Seems like you need only one pass saving resulting collation.
Resulting collation shouldn’t depend on way of passing through
arguments. And second call of collations_check_copatiility() is
redundant as well. Now you are using 2n^2 calls of this function,
but n is enough: you compare collation of first argument with
second one and save result in tmp. Then, compare tmp with
third argument etc.

> +
> +						if (collations_check_compatibility(
> +							lhs_id, is_lhs_forced,
> +							rhs_id, is_rhs_forced,
> +							&temp_id) != 0) {
> +							pParse->rc =
> +								SQL_TARANTOOL_ERROR;
> +							pParse->nErr++;
> +							return 0;
> +						}
> +
> +						is_temp_forced = (temp_id ==
> +								  lhs_id) ?
> +								 is_lhs_forced :
> +								 is_rhs_forced;
> +
> +						if (collations_check_compatibility(
> +							curr_id, is_curr_forced,
> +							temp_id, is_temp_forced,
> +							&curr_id) != 0) {
> +							pParse->rc =
> +								SQL_TARANTOOL_ERROR;
> +							pParse->nErr++;
> +							return 0;
> +						}
> +					}
> 				}
> +				coll = coll_by_id(curr_id)->coll;
> 			}
> 			if (pFarg) {
> 				if (constMask) {
> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
> index 8967ea3e0..47ee474bb 100644
> --- a/src/box/sql/sqlInt.h
> +++ b/src/box/sql/sqlInt.h
> @@ -1660,7 +1660,13 @@ struct FuncDestructor {
> #define SQL_FUNC_LIKE     0x0004	/* Candidate for the LIKE optimization */
> #define SQL_FUNC_CASE     0x0008	/* Case-sensitive LIKE-type function */
> #define SQL_FUNC_EPHEM    0x0010	/* Ephemeral.  Delete with VDBE */
> -#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called */
> +#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called.
> +					 * The flag is set when the collation
> +					 * of function arguments should be
> +					 * determined, using rules in
> +					 * collations_check_compatibility()
> +					 * function.
> +					 */
> #define SQL_FUNC_LENGTH   0x0040	/* Built-in length() function */
> #define SQL_FUNC_TYPEOF   0x0080	/* Built-in typeof() function */
> #define SQL_FUNC_COUNT    0x0100	/* Built-in count(*) aggregate */
> -- 

Please, provide basic test cases involving one or more built-in
functions and incompatible arguments (at least min-max funcs use it).
Moreover, this flag can’t be set for user-defined functions, which is pretty sad.

  reply	other threads:[~2019-03-25 19:26 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-20 11:11 [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func Ivan Koptelov
2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov
2019-03-25 19:26   ` n.pettik [this message]
2019-03-27 13:38     ` [tarantool-patches] " i.koptelov
2019-03-28 12:50       ` n.pettik
2019-03-28 14:08         ` i.koptelov
2019-03-29  9:57           ` n.pettik
2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov
2019-03-20 12:59   ` [tarantool-patches] Re: [PATCH 1/2] " i.koptelov
2019-03-26 12:32   ` [tarantool-patches] Re: [PATCH 2/2] " n.pettik
2019-03-27 13:39     ` i.koptelov
2019-03-28 12:57       ` n.pettik
2019-04-01 14:15 ` [tarantool-patches] Re: [PATCH 0/2] sql: add better coll. determination & position func Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6472870C-8952-43AC-9B86-8BB2E006502A@tarantool.org \
    --to=korablev@tarantool.org \
    --cc=ivan.koptelov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='[tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox