From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 8073C29F18 for ; Fri, 15 Mar 2019 13:07:16 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aJRA0EkS6VhC for ; Fri, 15 Mar 2019 13:07:16 -0400 (EDT) Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id CF03429E5C for ; Fri, 15 Mar 2019 13:07:15 -0400 (EDT) From: "n.pettik" Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_92FF203C-6576-4FC3-A807-C98D974D4F60" Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: [tarantool-patches] Re: [PATCH] sql: replace instr() with position() Date: Fri, 15 Mar 2019 20:07:13 +0300 In-Reply-To: References: <20190310074112.3260-1-ivan.koptelov@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org Cc: Ivan Koptelov --Apple-Mail=_92FF203C-6576-4FC3-A807-C98D974D4F60 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 >> Why these conditions can=E2=80=99t be compatible? >> What is the difference between SQL_FUNC_NEEDCOLL >> and new flag? Does NEEDCOLL refer to returning value? >> Please, comment this part. > These conditions can=E2=80=99t be compatible because > SQL_FUNC_NEEDCOLL and SQL_FUNC_ARG_COLL ask for > different actions. SQL_FUNC_NEEDCOLL asks > to find first argument with collation in > the arguments expression list (from left > to right) and use it inside function > as a collation of the whole expression. > SQL_FUNC_ARG_COLL asks to check if > arguments collations are compatible > and use the right one if they are. > I would add comment to the code. >=20 > If we want to establish proper collation > handling in all functions, we should > probably use only one flag to do check > and set the right collation. >=20 > I would merge these flags into one, OK? Yep. >>> bool unused; >>> uint32_t id; >>> if (sql_expr_coll(pParse, >>> @@ -4064,6 +4065,37 @@ sqlExprCodeTarget(Parse * pParse, Expr * = pExpr, int target) >>> return 0; >>> } >>> } >>> + if (pDef->funcFlags & SQL_FUNC_ARG_COLL) { >>> + assert(nFarg =3D=3D 2); >>=20 >> Add comment to this assertion or better to the whole branch. >> It is not obvious at first sight. Why only functions with two args=E2=80= =A6 >> For instance, there=E2=80=99s MAX function, which may take any number >> of arguments. Collations are also vital for this function, since >> they are used to compare strings. >>=20 >> What is more, I suggest to move this part in a separate patch, >> since position is not the only function that must apply collations >> on its operands. I don=E2=80=99t insist on dozens of tests on each = built-in >> function, just a few. > So do you propose to make two patches: first with adding proper > collation checking/setting in sqlExprCodeTarget() and another > with adding proper collation usage in functions in func.c? > Is it OK if I put them in one patch set? Ok, you can try. Anyway, you can always squash patches. >>=20 >>> + struct coll *colls[2] =3D {NULL}; >>=20 >> I=E2=80=99d rather use explicit naming: >> struct coll *lhs_coll =3D NULL; >> struct coll *rhs_coll =3D NULL; >>=20 >> It makes code a bit longer, but on the other hand IMHO >> makes it less prone to typos. > But this function sqlExprCodeTarget() is called for expression > list (of arguments). If we would use it for functions like MAX > (as we want to) then I think an array should be used. Or just > two loops to check every pair of arguments collations for > compatibility. You don=E2=80=99t need array: just go through arguments from left to the right and compare on compatibility their collations. >>=20 >>> + uint32_t colls_ids[2] =3D {0}; >>> + bool is_forced[2] =3D {false}; >>> + if (sql_expr_coll(pParse, = pFarg->a[0].pExpr, >>> + &is_forced[0], = &colls_ids[0], >>> + &colls[0]) !=3D 0) >>> + return 0; >>> + if (sql_expr_coll(pParse, = pFarg->a[1].pExpr, >>> + &is_forced[1], = &colls_ids[1], >>> + &colls[1]) !=3D 0) >>> + return 0; >>> + >>> + uint32_t coll_id; >>> + if = (collations_check_compatibility(colls_ids[0], >>> + = is_forced[0], >>> + = colls_ids[1], >>> + = is_forced[1], >>> + = &coll_id) >>> + !=3D = 0) { >>> + pParse->rc =3D = SQL_TARANTOOL_ERROR; >>> + pParse->nErr++; >>> + return 0; >>> + } >>> + >>> + coll =3D (coll_id =3D=3D colls_ids[0]) ? = colls[0] : >>> + = colls[1]; >>> + if (coll =3D=3D NULL) >>> + coll =3D = coll_by_id(COLL_NONE)->coll; >>=20 >> Why do we need this? > Because if there is no collation at all among > arguments, then the context would not have any > collation as well. And sqlGetFuncCollSeq() > would fail with assertion in this case, because > it always expects some collations being set in context. Hm, afair we added =E2=80=9Cnone=E2=80=9D collation to avoid such = situations. Will look at sql_expr_coll() and check if we can fix this. --Apple-Mail=_92FF203C-6576-4FC3-A807-C98D974D4F60 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

Why = these conditions can=E2=80=99t be compatible?
What is the = difference between SQL_FUNC_NEEDCOLL
and new flag? Does = NEEDCOLL refer to returning value?
Please, comment this = part.
These conditions can=E2=80=99t be compatible = because
SQL_FUNC_NEEDCOLL and SQL_FUNC_ARG_COLL ask for
different actions. = SQL_FUNC_NEEDCOLL asks
to find first argument with collation in
the arguments expression list = (from left
to right) and = use it inside function
as a collation of the whole expression.
SQL_FUNC_ARG_COLL asks to check = if
arguments = collations are compatible
and use the right one if they are.
I would add comment to the = code.

If we want to = establish proper collation
handling in all functions, we should
probably use only one flag to do = check
and set the = right collation.

I would merge these flags into one, OK?

Yep.
bool = unused;
uint32_t id;
if (sql_expr_coll(pParse,
@@ -4064,6 +4065,37 @@ sqlExprCodeTarget(Parse * pParse, Expr = * pExpr, int target)
return 0;
}
= = = }
+ if (pDef->funcFlags & = SQL_FUNC_ARG_COLL) {
+ assert(nFarg =3D=3D 2);

Add comment to this assertion or = better to the whole branch.
It is not obvious at first = sight. Why only functions with two args=E2=80=A6
For = instance, there=E2=80=99s MAX function, which may take any number
of arguments. Collations are also vital for this function, = since
they are used to compare strings.

What is more, I suggest to move this part in a separate = patch,
since position is not the only function that must = apply collations
on its operands. I don=E2=80=99t insist = on dozens of tests on each built-in
function, just a = few.
So do you propose to make two patches: first with adding = proper
collation = checking/setting in sqlExprCodeTarget() and another
with adding proper collation = usage in functions in func.c?
Is it OK if I put them in one patch set?

Ok, you can = try. Anyway, you can always squash patches.


+ struct = coll *colls[2] =3D {NULL};

I=E2=80= =99d rather use explicit naming:
struct coll *lhs_coll =3D = NULL;
struct coll *rhs_coll =3D NULL;

It makes code a bit longer, but on the other hand IMHO
makes it less prone to typos.
But this function = sqlExprCodeTarget() is called for expression
list (of arguments). If we would = use it for functions like MAX
(as we want to) then I think an array should be used. Or = just
two loops to = check every pair of arguments collations for
compatibility.

You don=E2=80= =99t need array: just go through arguments from left
to the = right and compare on compatibility their collations.


+ uint32_t = colls_ids[2] =3D {0};
+ bool is_forced[2] =3D {false};
+ = = = = if (sql_expr_coll(pParse, pFarg->a[0].pExpr,
+ = = = = = =   &is_forced[0], = &colls_ids[0],
+   &colls[0]) !=3D = 0)
+ return 0;
+ if (sql_expr_coll(pParse, = pFarg->a[1].pExpr,
+   &is_forced[1], = &colls_ids[1],
+   &colls[1]) !=3D = 0)
+ return 0;
+
+ uint32_t = coll_id;
+ if = (collations_check_compatibility(colls_ids[0],
+    is_forced[0],
+ = = = = = = = =    colls_ids[1],
+ = = = = = = = =    is_forced[1],
+ = = = = = = = =    &coll_id)
+ = = = = = = = =    !=3D= 0) {
+ pParse->rc =3D SQL_TARANTOOL_ERROR;
+ = pParse->nErr++;
+ return 0;
+ }
+
+ coll =3D (coll_id =3D=3D = colls_ids[0]) ? colls[0] :
+    colls[1];
+ = = = = if (coll =3D=3D NULL)
+ coll =3D = coll_by_id(COLL_NONE)->coll;

Why do we need this?
Because if there is no collation = at all among
arguments, = then the context would not have any
collation as well. And sqlGetFuncCollSeq()
would fail with assertion in = this case, because
it always expects some collations being set in = context.

Hm, afair we added =E2=80=9Cnone=E2=80=9D collation to avoid = such situations.
Will look at sql_expr_coll() and = check if we can fix this.

= --Apple-Mail=_92FF203C-6576-4FC3-A807-C98D974D4F60--