From: "n.pettik"
Subject: [tarantool-patches] Re: [PATCH 1/2] sql: derive collation for built-in functions
Date: Mon, 25 Feb 2019 21:32:02 +0300
Message-Id: <7CC1E113-479D-46FC-9A6F-6BE73E918FE7@tarantool.org>
In-Reply-To: <92715853-76c0-36ca-1bae-84a8a8939f7e@tarantool.org>
Reply-To: tarantool-patches@freelists.org
To: tarantool-patches@freelists.org
Cc: Vladislav Shpilevoy

> On 25 Feb 2019, at 15:58, Vladislav Shpilevoy wrote:
>
> Hi! Thanks for the patch!
>
> On 21/02/2019 21:01, Nikita Pettik wrote:
>> Functions such as trim(), substr() etc should return result with
>> collation derived from their arguments. So, lets add flag indicating
>> that collation of first argument must be applied to function's result to
>> SQL function definition.
>> Using this flag, we can derive appropriate
>> collation in sql_expr_coll().
>>
>> Part of #3932
>> ---
>>  src/box/sql/analyze.c       |  6 +++---
>>  src/box/sql/expr.c          | 23 +++++++++++++++++++++++
>>  src/box/sql/func.c          | 22 +++++++++++-----------
>>  src/box/sql/sqlInt.h        | 31 +++++++++++++++++++++++--------
>>  test/sql/collation.result   | 28 ++++++++++++++++++++++++++++
>>  test/sql/collation.test.lua | 11 +++++++++++
>>  6 files changed, 99 insertions(+), 22 deletions(-)
>>
>> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
>> index 2830ab639..5fb7285d8 100644
>> --- a/src/box/sql/sqlInt.h
>> +++ b/src/box/sql/sqlInt.h
>> @@ -1633,6 +1633,13 @@ struct FuncDef {
>>  	} u;
>>  	/* Return type. */
>>  	enum field_type ret_type;
>> +	/**
>> +	 * If function returns string, it may require collation
>> +	 * to be applied on its result. For instance, result of
>> +	 * substr() built-in function must have the same collation
>> +	 * as its first argument.
>> +	 */
>> +	bool is_coll_derived;
>>  };
>
> This way works only for builtin functions taking not a
> bind parameter ('?').

AFAIK, we can't pass a binding value together with a collation; we just
have no means to do things like this:

cn:execute('select trim(?)', { 'ABCD', collation = "unicode_ci" })

On the other hand, we can do this:

cn:execute('select trim(? COLLATE "unicode_ci")', { 'ABCD' })

> For user-defined functions and for
> bind parameters it does not fit. How can you determine
> a function's result collation, if it is not builtin, and
> does not depend on arguments?

In no way. We could extend the signature of sql_create_function and
allow passing a collation to be applied to the returned value. But I am
not sure that we should do this. Anyway, it wouldn't help us with the
initial issue: in our case the collation depends on one of the
arguments, so it changes *dynamically*. Hence, I guess these problems
are barely related.
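
To make the "collation follows the first argument" idea concrete, here is a
minimal, self-contained sketch. It is not the code from the patch and not
the real sql_expr_coll(): every type and function name below (expr_sketch,
func_def_sketch, derive_coll_id_sketch, COLL_NONE_SKETCH) is hypothetical
and only mirrors the is_coll_derived flag from the quoted diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "no collation"; not a Tarantool constant. */
#define COLL_NONE_SKETCH UINT32_MAX

/* Hypothetical expression kinds, reduced to what the example needs. */
enum expr_kind_sketch {
	EXPR_VALUE,	/* column, literal or bind parameter */
	EXPR_COLLATE,	/* <operand> COLLATE <name> */
	EXPR_FUNCTION,	/* func(args...) */
};

/* Hypothetical function definition with the flag from the quoted diff. */
struct func_def_sketch {
	const char *name;
	/* True if the result takes the collation of the first argument. */
	bool is_coll_derived;
};

/* Hypothetical expression node. */
struct expr_sketch {
	enum expr_kind_sketch kind;
	/* Collation of a value or of an explicit COLLATE clause. */
	uint32_t coll_id;
	/* Function definition; used only by EXPR_FUNCTION nodes. */
	const struct func_def_sketch *func;
	/* COLLATE operand or first function argument; NULL for values. */
	const struct expr_sketch *first_arg;
};

/*
 * Walk down through functions flagged as is_coll_derived until a node
 * that carries its own collation is found. Functions without the flag
 * (for example user-defined ones) yield no collation at all.
 */
uint32_t
derive_coll_id_sketch(const struct expr_sketch *expr)
{
	while (expr != NULL) {
		switch (expr->kind) {
		case EXPR_FUNCTION:
			if (expr->func == NULL || !expr->func->is_coll_derived)
				return COLL_NONE_SKETCH;
			/* Result collation depends on the first argument. */
			expr = expr->first_arg;
			break;
		case EXPR_COLLATE:
		case EXPR_VALUE:
		default:
			/* A bare bind parameter carries COLL_NONE_SKETCH. */
			return expr->coll_id;
		}
	}
	return COLL_NONE_SKETCH;
}
```

Under these assumptions, trim(?) yields no collation because the bind
parameter has none, trim(? COLLATE "unicode_ci") yields the collation of
the COLLATE node, and a function without the flag yields no collation
regardless of its arguments. That is the sense in which the result
collation is decided per call site rather than per function.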

Also, inlining a comment from P. Gulutzan
(https://github.com/tarantool/tarantool/issues/3932):

'''
It is true that user-defined functions will not know some things about
what an SQL caller is passing. We don't promise that they will, so I think
it is okay that it is the caller's responsibility to make sure relevant
information is passed explicitly. A possible issue is that the function cannot
use the utf8 module for all possible collations, but that is not an SQL issue.
'''

> Does SQL standard allow to define user functions without
> a runtime defined collation? If SQL standard does not define
> SQL functions at all, then what other vendors do with that
> problem?

There's no such opportunity in ANSI, if I'm not mistaken.
Generally speaking, other vendors have procedural SQL.
And since PSQL is a part of SQL, there are no such problems:
collation is a part of string-like types.