From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id D90EF2A889 for ; Mon, 15 Apr 2019 20:14:26 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FQp6-J1VhEXl for ; Mon, 15 Apr 2019 20:14:26 -0400 (EDT) Received: from smtp53.i.mail.ru (smtp53.i.mail.ru [94.100.177.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 1C0452A886 for ; Mon, 15 Apr 2019 20:14:26 -0400 (EDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: [tarantool-patches] Re: [PATCH] sql: modify TRIM() function signature From: Roman Khabibov In-Reply-To: <07e149a4-679a-53f1-ccf9-78219cc0ec47@tarantool.org> Date: Tue, 16 Apr 2019 03:14:23 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20190411173326.36120-1-roman.habibov@tarantool.org> <07e149a4-679a-53f1-ccf9-78219cc0ec47@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org Cc: Vladislav Shpilevoy Hi! Thanks for the review. > On Apr 14, 2019, at 9:01 PM, Vladislav Shpilevoy = wrote: >=20 > Hi! Thanks for the patch! See 20 comments below. >=20 > 1. Please, do all the issues on the master branch. > We cherry-pick on other branches from the master. >=20 > 2. Use the newest version of the branch. Your is outdated > on more than 2 weeks. Rebased on 2.1. > On 11/04/2019 20:33, Roman Khabibov wrote: >> According to the ANSI standart, ltrim, rtrim and trim should >=20 > 3. Use a spell checker. Sublime text editor has it too. >=20 > standart -> standard >=20 >> be merged into one unified TRIM() function. The specialization of >> trimming (left, right or both and trimming charcters) determined >=20 > charcters -> characters sql: modify TRIM() function signature =20 According to the ANSI standard, ltrim, rtrim and trim should be merged into one unified TRIM() function. The specialization of trimming (left, right or both and trimming characters) determined in arguments of this function. =20 Closes #3879 >> in arguments of this function. >>=20 >> Closes #3879 >> --- >> Branch: = https://github.com/tarantool/tarantool/tree/romanhabibov/gh-3879-trim >> Issue: https://github.com/tarantool/tarantool/issues/3879 >>=20 >> extra/mkkeywordhash.c | 5 ++ >> src/box/sql/func.c | 46 ++++++++------ >> src/box/sql/global.c | 6 +- >> src/box/sql/parse.y | 48 +++++++++++++++ >> test/sql-tap/badutf1.test.lua | 14 ++--- >> test/sql-tap/func.test.lua | 111 = ++++++++++++++++++++++------------ >> test/sql-tap/with1.test.lua | 2 +- >> 7 files changed, 165 insertions(+), 67 deletions(-) >>=20 >> diff --git a/extra/mkkeywordhash.c b/extra/mkkeywordhash.c >> index be7bd5545..94a768323 100644 >> --- a/extra/mkkeywordhash.c >> +++ b/extra/mkkeywordhash.c >> @@ -91,6 +91,7 @@ struct Keyword { >> # define CTE 0x00040000 >> #endif >> # define RESERVED 0x00000001 >> +# define FUNCTION 0x00080000 >=20 > 4. These fields are stored in struct Keyword.mask field, which > is never used for anything more complex than 'mask =3D=3D 0' check. > And I do not see a reason why do you need here anything but > 'ALWAYS' constant. Also, see the issue: >=20 > https://github.com/tarantool/tarantool/issues/4155 diff --git a/extra/mkkeywordhash.c b/extra/mkkeywordhash.c index be7bd5545..76e3265e7 100644 --- a/extra/mkkeywordhash.c +++ b/extra/mkkeywordhash.c @@ -278,6 +278,10 @@ static Keyword aKeywordTable[] =3D { { "WHILE", "TK_STANDARD", RESERVED, true = }, { "TEXT", "TK_TEXT", RESERVED, true = }, { "TRUNCATE", "TK_TRUNCATE", ALWAYS, true = }, + { "TRIM", "TK_TRIM", ALWAYS, true = }, + { "LEADING", "TK_LEADING", ALWAYS, true = }, + { "TRAILING", "TK_TRAILING", ALWAYS, true = }, + { "BOTH", "TK_BOTH", ALWAYS, true = }, }; > 8. Please, create a enum with normal names for these constants. +enum trim_specification { + LEADING =3D 1, + TRAILING =3D 2, + BOTH =3D 3 +}; > 12. Pointers should be compared with NULL, not 0. + if (sql_value_type(argv[0]) =3D=3D SQL_NULL) { + return; + } + if ((input_str =3D sql_value_text(argv[0])) =3D=3D NULL) { + return; + } >=20 >> return; >> } else { >> const unsigned char *z =3D zCharSet; >> - int trim_set_sz =3D sql_value_bytes(argv[1]); >> + int trim_set_sz =3D sql_value_bytes(argv[source_index - = 1]); >> /* >> * Count the number of UTF-8 characters passing >> * through the entire char set, but not up >> @@ -1272,8 +1288,7 @@ trimFunc(sql_context * context, int argc, = sql_value ** argv) >> } >> } >> if (nChar > 0) { >> - flags =3D SQL_PTR_TO_INT(sql_user_data(context)); >> - if (flags & 1) { >> + if (trim_side & 1) { >=20 > 13. When checking flags, use (flag & ...) !=3D 0 instead of an > implicit conversion. In other places too. + if ((flags & 1) !=3D 0) { + if ((flags & 2) !=3D 0) { >=20 >> while (nIn > 0) { >> int len =3D 0; >> for (i =3D 0; i < nChar; i++) { >> @@ -1288,7 +1303,7 @@ trimFunc(sql_context * context, int argc, = sql_value ** argv) >> nIn -=3D len; >> } >> } >> - if (flags & 2) { >> + if (trim_side & 2) { >> while (nIn > 0) { >> int len =3D 0; >> for (i =3D 0; i < nChar; i++) { >> @@ -1738,12 +1753,9 @@ sqlRegisterBuiltinFunctions(void) >> FIELD_TYPE_INTEGER), >> FUNCTION2(likely, 1, 0, 0, noopFunc, SQL_FUNC_UNLIKELY, >> FIELD_TYPE_INTEGER), >> - FUNCTION_COLL(ltrim, 1, 1, 0, trimFunc), >> - FUNCTION_COLL(ltrim, 2, 1, 0, trimFunc), >> - FUNCTION_COLL(rtrim, 1, 2, 0, trimFunc), >> - FUNCTION_COLL(rtrim, 2, 2, 0, trimFunc), >> FUNCTION_COLL(trim, 1, 3, 0, trimFunc), >> FUNCTION_COLL(trim, 2, 3, 0, trimFunc), >> + FUNCTION_COLL(trim, 3, 3, 0, trimFunc), >=20 > 14. Better write three trim functions taking different number of > args, converting them to normal types, and calling the single > trim function. Instead of making a pile of 'if's about argc inside > the current implementation. Done. But now I have dublicated pieces of code: + const unsigned char *input_str; + assert(argc =3D=3D 1); + (void) argc; + + if (sql_value_type(argv[0]) =3D=3D SQL_NULL) { + return; + } + if ((input_str =3D sql_value_text(argv[0])) =3D=3D NULL) { + return; + } + + int input_str_sz =3D sql_value_bytes(argv[0]); + assert(input_str =3D=3D sql_value_text(argv[0])); >=20 >> FUNCTION(min, -1, 0, 1, minmaxFunc, FIELD_TYPE_SCALAR), >> FUNCTION(min, 0, 0, 1, 0, FIELD_TYPE_SCALAR), >> AGGREGATE2(min, 1, 0, 1, minmaxStep, minMaxFinalize,> = diff --git a/src/box/sql/parse.y b/src/box/sql/parse.y >> index d2614d9b0..53e5fd932 100644 >> --- a/src/box/sql/parse.y >> +++ b/src/box/sql/parse.y >> @@ -937,6 +937,54 @@ expr(A) ::=3D CAST(X) LP expr(E) AS typedef(T) = RP(Y). { >> sqlExprAttachSubtrees(pParse->db, A.pExpr, E.pExpr, 0); >> } >> %endif SQL_OMIT_CAST >> + >> +expr(A) ::=3D TRIM(X) LP trim_operands(Y) RP(E). { >> + A.pExpr =3D sqlExprFunction(pParse, Y, &X); >> + spanSet(&A, &X, &E); >> +} >> + >> +%type trim_operands {ExprList*} >=20 > 15. In new code we use 'struct' before each struct > type, and we put a whitespace before '*'. The same in > other places. Done. >> + A =3D sql_expr_list_append(pParse->db, A, Y); >> + } >> +} >> + >> +from_clause(A) ::=3D trim_set(Y) FROM. { >> + A =3D sql_expr_list_append(pParse->db, NULL, Y); >> +} >> + >> +%type trim_set {Expr*} >> +%destructor trim_set {sql_expr_delete(pParse->db, $$, false);} >> + >> +trim_set(A) ::=3D . {A =3D 0;} >=20 > 17. The same. Assign NULL, not 0. +trim_character(A) ::=3D . { A =3D NULL; } >=20 >> +trim_set(A) ::=3D expr(X). {A =3D X.pExpr;} >> + >> +%type trim_specification {int} >> + >> +trim_specification(A) ::=3D LEADING. {A =3D 1;} >> +trim_specification(A) ::=3D TRAILING. {A =3D 2;} >> +trim_specification(A) ::=3D BOTH. {A =3D 3;} >=20 > 18. Why is the grammar so complex? In the standard its > definition takes 12 lines. Because I haven=E2=80=99t found another ways to implement grammar, that = will be able to be compiled. > In your grammar you've allowed this: TRIM(FROM str). > But it is prohibited by the standard, and leads to an > assertion: Now I prohibit that. > 19. I've refactored the grammar a bit, but it can't be compiled. My > diff is below. Probably it can help. +test:do_catchsql_test( + "func-22.38", + [[ + SELECT TRIM(FROM 'xyxzy'); + ]], { + -- + 1, "Syntax error near 'FROM'" + -- + }) + > 20. 3879 was not about 'BOTH' only. Please, describe the > issue in more details, and test the whole grammar. As the > assertion fail above shows, you didn=E2=80=99t. +-- gh-3879 Check new TRIM() grammar, particularly BOTH keyword and FROM = without +-- any agrs before. LEADING and TRAILING keywords is checked above. commit f8b3475d9c5f4f479c2ee1709c78a16e1f02aec9 Author: Roman Khabibov Date: Thu Mar 28 14:01:33 2019 +0300 sql: modify TRIM() function signature =20 According to the ANSI standard, ltrim, rtrim and trim should be merged into one unified TRIM() function. The specialization of trimming (left, right or both and trimming characters) determined in arguments of this function. =20 Closes #3879 diff --git a/extra/mkkeywordhash.c b/extra/mkkeywordhash.c index be7bd5545..76e3265e7 100644 --- a/extra/mkkeywordhash.c +++ b/extra/mkkeywordhash.c @@ -278,6 +278,10 @@ static Keyword aKeywordTable[] =3D { { "WHILE", "TK_STANDARD", RESERVED, true = }, { "TEXT", "TK_TEXT", RESERVED, true = }, { "TRUNCATE", "TK_TRUNCATE", ALWAYS, true = }, + { "TRIM", "TK_TRIM", ALWAYS, true = }, + { "LEADING", "TK_LEADING", ALWAYS, true = }, + { "TRAILING", "TK_TRAILING", ALWAYS, true = }, + { "BOTH", "TK_BOTH", ALWAYS, true = }, }; =20 /* Number of keywords */ diff --git a/src/box/sql/func.c b/src/box/sql/func.c index abeecefa1..bf7e7a652 100644 --- a/src/box/sql/func.c +++ b/src/box/sql/func.c @@ -1286,108 +1286,223 @@ replaceFunc(sql_context * context, int argc, = sql_value ** argv) sql_result_text(context, (char *)zOut, j, sql_free); } =20 -/* - * Implementation of the TRIM(), LTRIM(), and RTRIM() functions. - * The userdata is 0x1 for left trim, 0x2 for right trim, 0x3 for both. +enum trim_specification { + LEADING =3D 1, + TRAILING =3D 2, + BOTH =3D 3 +}; + +/** + * Remove chars included into @a collation from @a input_str. + * @param context SQL context. + * @param flags Trim specification: left, right or both. + * @param collation Character set. + * @param coll_sz Character set size in bytes. + * @param input_str Input string for trimming. + * @param input_str_sz Input string size in bytes. */ static void -trimFunc(sql_context * context, int argc, sql_value ** argv) +trim_procedure(sql_context * context, enum trim_specification flags, + const unsigned char *collation, int coll_sz, + const unsigned char *input_str, int input_str_sz) { - const unsigned char *zIn; /* Input string */ - const unsigned char *zCharSet; /* Set of characters to trim */ - int nIn; /* Number of bytes in input */ - int flags; /* 1: trimleft 2: trimright 3: trim */ - int i; /* Loop counter */ - unsigned char *aLen =3D 0; /* Length of each character in = zCharSet */ - unsigned char **azChar =3D 0; /* Individual characters in = zCharSet */ - int nChar; /* Number of characters in zCharSet */ + int i; + /*=20 + * Length of each character in collation. + */ + unsigned char *aLen =3D 0; + /* + * Individual characters in collation. + */ + unsigned char **azChar =3D 0; + /*=20 + * Number of characters in collation. + */ + int nChar; =20 - if (sql_value_type(argv[0]) =3D=3D SQL_NULL) { - return; - } - zIn =3D sql_value_text(argv[0]); - if (zIn =3D=3D 0) - return; - nIn =3D sql_value_bytes(argv[0]); - assert(zIn =3D=3D sql_value_text(argv[0])); - if (argc =3D=3D 1) { - static const unsigned char lenOne[] =3D { 1 }; - static unsigned char *const azOne[] =3D { (u8 *) " " }; - nChar =3D 1; - aLen =3D (u8 *) lenOne; - azChar =3D (unsigned char **)azOne; - zCharSet =3D 0; - } else if ((zCharSet =3D sql_value_text(argv[1])) =3D=3D 0) { - return; - } else { - const unsigned char *z =3D zCharSet; - int trim_set_sz =3D sql_value_bytes(argv[1]); - /* - * Count the number of UTF-8 characters passing - * through the entire char set, but not up - * to the '\0' or X'00' character. This allows - * to handle trimming set containing such - * characters. - */ - nChar =3D sql_utf8_char_count(z, trim_set_sz); - if (nChar > 0) { - azChar =3D - contextMalloc(context, - ((i64) nChar) * (sizeof(char = *) + 1)); - if (azChar =3D=3D 0) { - return; - } - aLen =3D (unsigned char *)&azChar[nChar]; - z =3D zCharSet; - i =3D 0; - nChar =3D 0; - int handled_bytes_cnt =3D trim_set_sz; - while(handled_bytes_cnt > 0) { - azChar[nChar] =3D (unsigned char *)(z + = i); - SQL_UTF8_FWD_1(z, i, trim_set_sz); - aLen[nChar] =3D (u8) (z + i - = azChar[nChar]); - handled_bytes_cnt -=3D aLen[nChar]; - nChar++; - } + const unsigned char *z =3D collation; + /* + * Count the number of UTF-8 characters passing + * through the entire char set, but not up + * to the '\0' or X'00' character. This allows + * to handle trimming set containing such + * characters. + */ + nChar =3D sql_utf8_char_count(z, coll_sz); + if (nChar > 0) { + azChar =3D + contextMalloc(context, + ((i64) nChar) * (sizeof(char *) + 1)); + if (azChar =3D=3D 0) { + return; + } + aLen =3D (unsigned char *)&azChar[nChar]; + z =3D collation; + i =3D 0; + nChar =3D 0; + int handled_bytes_cnt =3D coll_sz; + while(handled_bytes_cnt > 0) { + azChar[nChar] =3D (unsigned char *)(z + i); + SQL_UTF8_FWD_1(z, i, coll_sz); + aLen[nChar] =3D (u8) (z + i - azChar[nChar]); + handled_bytes_cnt -=3D aLen[nChar]; + nChar++; } } if (nChar > 0) { - flags =3D SQL_PTR_TO_INT(sql_user_data(context)); - if (flags & 1) { - while (nIn > 0) { + if ((flags & 1) !=3D 0) { + while (input_str_sz > 0) { int len =3D 0; for (i =3D 0; i < nChar; i++) { len =3D aLen[i]; - if (len <=3D nIn - && memcmp(zIn, azChar[i], = len) =3D=3D 0) + if (len <=3D input_str_sz + && memcmp(input_str, + azChar[i], len) =3D=3D= 0) break; } if (i >=3D nChar) break; - zIn +=3D len; - nIn -=3D len; + input_str +=3D len; + input_str_sz -=3D len; } } - if (flags & 2) { - while (nIn > 0) { + if ((flags & 2) !=3D 0) { + while (input_str_sz > 0) { int len =3D 0; for (i =3D 0; i < nChar; i++) { len =3D aLen[i]; - if (len <=3D nIn - && memcmp(&zIn[nIn - len], + if (len <=3D input_str_sz + && = memcmp(&input_str[input_str_sz - len], azChar[i], len) =3D=3D= 0) break; } if (i >=3D nChar) break; - nIn -=3D len; + input_str_sz -=3D len; } } - if (zCharSet) { + if (collation) { sql_free(azChar); } } - sql_result_text(context, (char *)zIn, nIn, SQL_TRANSIENT); + sql_result_text(context, (char *)input_str,input_str_sz, + SQL_TRANSIENT); +} + +/** + * Normalize args from @a argv input array when it has one arg only. + *=20 + * Case: TRIM() + * Call trimming procedure with BOTH as the flags and " " as the = collation. + * + * @param context SQL context. + * @param argc Number of args. + * @param argv Args array. + */ +static void +trim_func_one_arg(sql_context * context, int argc, sql_value **argv) +{ + const unsigned char *input_str; + assert(argc =3D=3D 1); + (void) argc; + + if (sql_value_type(argv[0]) =3D=3D SQL_NULL) { + return; + } + if ((input_str =3D sql_value_text(argv[0])) =3D=3D NULL) { + return; + } + + int input_str_sz =3D sql_value_bytes(argv[0]); + assert(input_str =3D=3D sql_value_text(argv[0])); + + trim_procedure(context, BOTH, (const unsigned char *) " ", + 1, input_str, input_str_sz); +} + +/** + * Normalize args from @a argv input array when it has two args. + *=20 + * Case: TRIM( FROM ) + * If user has specified only, call trimming procedure = with + * BOTH as the flags and that collation. + * + * Case: TRIM(LEADING/TRAILING/BOTH FROM ) + * If user has specified side keyword only, call trimming procedure + * with the specified side and " " as the collation. + * + * @param context SQL context. + * @param argc Number of args. + * @param argv Args array. + */ +static void +trim_func_two_args(sql_context * context, int argc, sql_value **argv) +{ + const unsigned char *input_str; + assert(argc =3D=3D 2); + (void) argc; + + if (sql_value_type(argv[1]) =3D=3D SQL_NULL) { + return; + } + if ((input_str =3D sql_value_text(argv[1])) =3D=3D NULL) { + return; + } + + int input_str_sz =3D sql_value_bytes(argv[1]); + assert(input_str =3D=3D sql_value_text(argv[1])); + + const unsigned char *collation; + if (sql_value_type(argv[0]) =3D=3D SQL_INTEGER) { + trim_procedure(context, sql_value_int(argv[0]), + (const unsigned char *) " ", 1, + input_str, input_str_sz); + } else if ((collation =3D sql_value_text(argv[0])) =3D=3D NULL) = { + return; + } else { + int coll_sz =3D sql_value_bytes(argv[0]); + trim_procedure(context, BOTH, collation, coll_sz, = input_str, + input_str_sz); + } +} + +/** + * Normalize args from @a argv input array when it has three args. + * + * Case: TRIM(LEADING/TRAILING/BOTH FROM ) + * User has specified side keyword and , call trimming + * procedure with that args. + * + * @param context SQL context. + * @param argc Number of args. + * @param argv Args array. + */ +static void +trim_func_three_args(sql_context * context, int argc, sql_value **argv) +{ + const unsigned char *input_str; + assert(argc =3D=3D 3); + (void) argc; + + if (sql_value_type(argv[2]) =3D=3D SQL_NULL) { + return; + } + if ((input_str =3D sql_value_text(argv[2])) =3D=3D NULL) { + return; + } + + int input_str_sz =3D sql_value_bytes(argv[2]); + assert(input_str =3D=3D sql_value_text(argv[2])); + + const unsigned char *collation; + assert(sql_value_type(argv[0]) =3D=3D SQL_INTEGER); + if ((collation =3D sql_value_text(argv[1])) !=3D 0) { + int coll_sz =3D sql_value_bytes(argv[1]); + trim_procedure(context, sql_value_int(argv[0]), = collation, + coll_sz, input_str, input_str_sz); + } else { + return; + } } =20 #ifdef SQL_ENABLE_UNKNOWN_SQL_FUNCTION @@ -1818,12 +1933,9 @@ sqlRegisterBuiltinFunctions(void) FIELD_TYPE_INTEGER), FUNCTION2(likely, 1, 0, 0, noopFunc, SQL_FUNC_UNLIKELY, FIELD_TYPE_INTEGER), - FUNCTION_COLL(ltrim, 1, 1, 0, trimFunc), - FUNCTION_COLL(ltrim, 2, 1, 0, trimFunc), - FUNCTION_COLL(rtrim, 1, 2, 0, trimFunc), - FUNCTION_COLL(rtrim, 2, 2, 0, trimFunc), - FUNCTION_COLL(trim, 1, 3, 0, trimFunc), - FUNCTION_COLL(trim, 2, 3, 0, trimFunc), + FUNCTION_COLL(trim, 1, 3, 0, trim_func_one_arg), + FUNCTION_COLL(trim, 2, 3, 0, trim_func_two_args), + FUNCTION_COLL(trim, 3, 3, 0, trim_func_three_args), FUNCTION(min, -1, 0, 1, minmaxFunc, FIELD_TYPE_SCALAR), FUNCTION(min, 0, 0, 1, 0, FIELD_TYPE_SCALAR), AGGREGATE2(min, 1, 0, 1, minmaxStep, minMaxFinalize, diff --git a/src/box/sql/parse.y b/src/box/sql/parse.y index 099daf512..985d33605 100644 --- a/src/box/sql/parse.y +++ b/src/box/sql/parse.y @@ -1032,6 +1032,51 @@ expr(A) ::=3D CAST(X) LP expr(E) AS typedef(T) = RP(Y). { sqlExprAttachSubtrees(pParse->db, A.pExpr, E.pExpr, 0); } %endif SQL_OMIT_CAST + +expr(A) ::=3D TRIM(X) LP trim_operands(Y) RP(E). { + A.pExpr =3D sqlExprFunction(pParse, Y, &X); + spanSet(&A, &X, &E); + } + +%type trim_operands {struct ExprList *} +%destructor trim_operands { sql_expr_list_delete(pParse->db, $$); } + +trim_operands(A) ::=3D trim_from_clause(F) expr(Y). { + A =3D sql_expr_list_append(pParse->db, F, Y.pExpr); +} + +trim_operands(A) ::=3D expr(Y). { + A =3D sql_expr_list_append(pParse->db, NULL, Y.pExpr); +} + +%type trim_from_clause {struct ExprList *} +%destructor trim_from_clause { sql_expr_list_delete(pParse->db, $$); } + +trim_from_clause(A) ::=3D expr(Y) FROM. { + A =3D sql_expr_list_append(pParse->db, NULL, Y.pExpr); +} + +trim_from_clause(A) ::=3D trim_specification(N) trim_character(Y) FROM. = { + struct Expr *p =3D sql_expr_new_dequoted(pParse->db, TK_INTEGER, + &sqlIntTokens[N]); + A =3D sql_expr_list_append(pParse->db, NULL, p); + if (Y !=3D NULL) { + A =3D sql_expr_list_append(pParse->db, A, Y); + } +} + +%type trim_character {struct Expr *} +%destructor trim_character {sql_expr_delete(pParse->db, $$, false);} + +trim_character(A) ::=3D . { A =3D NULL; } +trim_character(A) ::=3D expr(X). { A =3D X.pExpr; } + +%type trim_specification {int} + +trim_specification(A) ::=3D LEADING. {A =3D 1;} +trim_specification(A) ::=3D TRAILING. {A =3D 2;} +trim_specification(A) ::=3D BOTH. {A =3D 3;} + expr(A) ::=3D id(X) LP distinct(D) exprlist(Y) RP(E). { if( Y && Y->nExpr>pParse->db->aLimit[SQL_LIMIT_FUNCTION_ARG] ){ const char *err =3D diff --git a/src/box/sql/parse_def.c b/src/box/sql/parse_def.c index 49c76a326..aa1323cb2 100644 --- a/src/box/sql/parse_def.c +++ b/src/box/sql/parse_def.c @@ -34,7 +34,9 @@ =20 const struct Token sqlIntTokens[] =3D { {"0", 1, false}, - {"1", 1, false} + {"1", 1, false}, + {"2", 1, false}, + {"3", 1, false}, }; =20 void diff --git a/src/box/sql/parse_def.h b/src/box/sql/parse_def.h index a1af2bacd..5899a7e4e 100644 --- a/src/box/sql/parse_def.h +++ b/src/box/sql/parse_def.h @@ -87,7 +87,7 @@ struct Token { bool isReserved; }; =20 -/** Constant tokens for values 0 and 1. */ +/** Constant tokens for integer values. */ extern const struct Token sqlIntTokens[]; =20 /** Generate a Token object from a string. */ diff --git a/test/sql-tap/badutf1.test.lua = b/test/sql-tap/badutf1.test.lua index d104efaa9..d32bafae0 100755 --- a/test/sql-tap/badutf1.test.lua +++ b/test/sql-tap/badutf1.test.lua @@ -302,7 +302,7 @@ test:do_test( test:do_test( "badutf-4.1", function() - return test:execsql2("SELECT = hex(trim('\x80\x80\x80\xf0\x80\x80\x80\xff','\x80\xff')) AS x") + return test:execsql2("SELECT hex(trim('\x80\xff' FROM = '\x80\x80\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "F0" @@ -312,7 +312,7 @@ test:do_test( test:do_test( "badutf-4.2", function() - return test:execsql2("SELECT = hex(ltrim('\x80\x80\x80\xf0\x80\x80\x80\xff','\x80\xff')) AS x") + return test:execsql2("SELECT hex(trim(LEADING '\x80\xff' FROM = '\x80\x80\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "F0808080FF" @@ -322,7 +322,7 @@ test:do_test( test:do_test( "badutf-4.3", function() - return test:execsql2("SELECT = hex(rtrim('\x80\x80\x80\xf0\x80\x80\x80\xff','\x80\xff')) AS x") + return test:execsql2("SELECT hex(trim(TRAILING '\x80\xff' FROM = '\x80\x80\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "808080F0" @@ -332,7 +332,7 @@ test:do_test( test:do_test( "badutf-4.4", function() - return test:execsql2("SELECT = hex(trim('\x80\x80\x80\xf0\x80\x80\x80\xff','\xff\x80')) AS x") + return test:execsql2("SELECT hex(trim('\xff\x80' FROM = '\x80\x80\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "808080F0808080FF" @@ -342,7 +342,7 @@ test:do_test( test:do_test( "badutf-4.5", function() - return test:execsql2("SELECT = hex(trim('\xff\x80\x80\xf0\x80\x80\x80\xff','\xff\x80')) AS x") + return test:execsql2("SELECT hex(trim('\xff\x80' FROM = '\xff\x80\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "80F0808080FF" @@ -352,7 +352,7 @@ test:do_test( test:do_test( "badutf-4.6", function() - return test:execsql2("SELECT = hex(trim('\xff\x80\xf0\x80\x80\x80\xff','\xff\x80')) AS x") + return test:execsql2("SELECT hex(trim('\xff\x80' FROM = '\xff\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "F0808080FF" @@ -362,7 +362,7 @@ test:do_test( test:do_test( "badutf-4.7", function() - return test:execsql2("SELECT = hex(trim('\xff\x80\xf0\x80\x80\x80\xff','\xff\x80\x80')) AS x") + return test:execsql2("SELECT hex(trim('\xff\x80\x80' FROM = '\xff\x80\xf0\x80\x80\x80\xff')) AS x") end, { -- "X", "FF80F0808080FF" diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua index 251cc3534..8fe04fab1 100755 --- a/test/sql-tap/func.test.lua +++ b/test/sql-tap/func.test.lua @@ -1,6 +1,6 @@ #!/usr/bin/env tarantool test =3D require("sqltester") -test:plan(14586) +test:plan(14590) =20 --!./tcltestrunner.lua -- 2001 September 15 @@ -1912,37 +1912,37 @@ test:do_test( test:do_catchsql_test( "func-22.1", [[ - SELECT trim(1,2,3) + SELECT TRIM(1,2,3) ]], { -- - 1, "wrong number of arguments to function TRIM()" + 1, "Syntax error near ','" -- }) =20 test:do_catchsql_test( "func-22.2", [[ - SELECT ltrim(1,2,3) + SELECT LTRIM(1,2,3) ]], { -- - 1, "wrong number of arguments to function LTRIM()" + 1, "Function 'LTRIM' does not exist" -- }) =20 test:do_catchsql_test( "func-22.3", [[ - SELECT rtrim(1,2,3) + SELECT RTRIM(1,2,3) ]], { -- - 1, "wrong number of arguments to function RTRIM()" + 1, "Function 'RTRIM' does not exist" -- }) =20 test:do_execsql_test( "func-22.4", [[ - SELECT trim(' hi '); + SELECT TRIM(' hi '); ]], { -- "hi" @@ -1952,7 +1952,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.5", [[ - SELECT ltrim(' hi '); + SELECT TRIM(LEADING FROM ' hi '); ]], { -- "hi " @@ -1962,7 +1962,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.6", [[ - SELECT rtrim(' hi '); + SELECT TRIM(TRAILING FROM ' hi '); ]], { -- " hi" @@ -1972,7 +1972,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.7", [[ - SELECT trim(' hi ','xyz'); + SELECT TRIM('xyz' FROM ' hi '); ]], { -- " hi " @@ -1982,7 +1982,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.8", [[ - SELECT ltrim(' hi ','xyz'); + SELECT TRIM(LEADING 'xyz' FROM ' hi '); ]], { -- " hi " @@ -1992,7 +1992,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.9", [[ - SELECT rtrim(' hi ','xyz'); + SELECT TRIM(TRAILING 'xyz' FROM ' hi '); ]], { -- " hi " @@ -2002,7 +2002,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.10", [[ - SELECT trim('xyxzy hi zzzy','xyz'); + SELECT TRIM('xyz' FROM 'xyxzy hi zzzy'); ]], { -- " hi " @@ -2012,7 +2012,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.11", [[ - SELECT ltrim('xyxzy hi zzzy','xyz'); + SELECT TRIM(LEADING 'xyz' FROM 'xyxzy hi zzzy'); ]], { -- " hi zzzy" @@ -2022,7 +2022,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.12", [[ - SELECT rtrim('xyxzy hi zzzy','xyz'); + SELECT TRIM(TRAILING 'xyz' FROM 'xyxzy hi zzzy'); ]], { -- "xyxzy hi " @@ -2032,7 +2032,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.13", [[ - SELECT trim(' hi ',''); + SELECT TRIM('' FROM ' hi '); ]], { -- " hi " @@ -2043,7 +2043,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.14", [[ - SELECT hex(trim(x'c280e1bfbff48fbfbf6869',x'6162e1bfbfc280')) + SELECT hex(TRIM(x'6162e1bfbfc280' FROM = x'c280e1bfbff48fbfbf6869')) ]], { -- "F48FBFBF6869" @@ -2052,8 +2052,8 @@ test:do_execsql_test( =20 test:do_execsql_test( "func-22.15", - [[SELECT hex(trim(x'6869c280e1bfbff48fbfbf61', - x'6162e1bfbfc280f48fbfbf'))]], { + [[SELECT hex(TRIM(x'6162e1bfbfc280f48fbfbf' + FROM x'6869c280e1bfbff48fbfbf61'))]], { -- "6869" -- @@ -2062,7 +2062,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.16", [[ - SELECT hex(trim(x'ceb1ceb2ceb3',x'ceb1')); + SELECT hex(TRIM(x'ceb1' FROM x'ceb1ceb2ceb3')); ]], { -- "CEB2CEB3" @@ -2073,7 +2073,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.20", [[ - SELECT typeof(trim(NULL)); + SELECT typeof(TRIM(NULL)); ]], { -- "null" @@ -2083,7 +2083,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.21", [[ - SELECT typeof(trim(NULL,'xyz')); + SELECT typeof(TRIM('xyz' FROM NULL)); ]], { -- "null" @@ -2093,7 +2093,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.22", [[ - SELECT typeof(trim('hello',NULL)); + SELECT typeof(TRIM(NULL FROM 'hello')); ]], { -- "null" @@ -2105,7 +2105,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.23", [[ - SELECT TRIM(X'004100', X'00'); + SELECT TRIM(X'00' FROM X'004100'); ]], { -- "A" @@ -2115,7 +2115,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.24", [[ - SELECT TRIM(X'004100', X'0000'); + SELECT TRIM(X'0000' FROM X'004100'); ]], { -- "A" @@ -2125,7 +2125,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.25", [[ - SELECT TRIM(X'004100', X'0042'); + SELECT TRIM(X'0042' FROM X'004100'); ]], { -- "A" @@ -2135,7 +2135,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.26", [[ - SELECT TRIM(X'00004100420000', X'00'); + SELECT TRIM(X'00' FROM X'00004100420000'); ]], { -- "A\0B" @@ -2145,7 +2145,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.27", [[ - SELECT LTRIM(X'004100', X'00'); + SELECT TRIM(LEADING X'00' FROM X'004100'); ]], { -- "A\0" @@ -2155,7 +2155,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.28", [[ - SELECT LTRIM(X'004100', X'0000'); + SELECT TRIM(LEADING X'0000' FROM X'004100'); ]], { -- "A\0" @@ -2165,7 +2165,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.29", [[ - SELECT LTRIM(X'004100', X'0042'); + SELECT TRIM(LEADING X'0042' FROM X'004100'); ]], { -- "A\0" @@ -2175,7 +2175,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.30", [[ - SELECT LTRIM(X'00004100420000', X'00'); + SELECT TRIM(LEADING X'00' FROM X'00004100420000'); ]], { -- "A\0B\0\0" @@ -2185,7 +2185,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.31", [[ - SELECT RTRIM(X'004100', X'00'); + SELECT TRIM(TRAILING X'00' FROM X'004100'); ]], { -- "\0A" @@ -2195,7 +2195,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.32", [[ - SELECT RTRIM(X'004100', X'0000'); + SELECT TRIM(TRAILING X'0000' FROM X'004100'); ]], { -- "\0A" @@ -2205,7 +2205,7 @@ test:do_execsql_test( test:do_execsql_test( "func-22.33", [[ - SELECT RTRIM(X'004100', X'0042'); + SELECT TRIM(TRAILING X'0042' FROM X'004100'); ]], { -- "\0A" @@ -2215,13 +2215,55 @@ test:do_execsql_test( test:do_execsql_test( "func-22.34", [[ - SELECT RTRIM(X'00004100420000', X'00'); + SELECT TRIM(TRAILING X'00' FROM X'00004100420000'); ]], { -- "\0\0A\0B" -- }) =20 +-- gh-3879 Check new TRIM() grammar, particularly BOTH keyword and FROM = without +-- any agrs before. LEADING and TRAILING keywords is checked above. + +test:do_execsql_test( + "func-22.35", + [[ + SELECT TRIM(BOTH FROM ' hi '); + ]], { + -- + "hi" + -- + }) +test:do_execsql_test( + "func-22.36", + [[ + SELECT TRIM(BOTH 'xyz' FROM ' hi '); + ]], { + -- + " hi " + -- + }) + +test:do_execsql_test( + "func-22.37", + [[ + SELECT TRIM(BOTH 'xyz' FROM 'xyxzy hi zzzy'); + ]], { + -- + " hi " + -- + }) + +test:do_catchsql_test( + "func-22.38", + [[ + SELECT TRIM(FROM 'xyxzy'); + ]], { + -- + 1, "Syntax error near 'FROM'" + -- + }) + -- This is to test the deprecated sql_aggregate_count() API. -- --test:do_test( @@ -2838,16 +2880,16 @@ test:do_execsql_test( "SELECT TRIM(CHAR(32,00,32,00,32));", {string.char(00,32,00)}) =20 --- LTRIM +-- LEFT TRIM test:do_execsql_test( "func-70", - "SELECT LTRIM(CHAR(32,00,32,00,32));", + "SELECT TRIM(LEADING FROM CHAR(32,00,32,00,32));", {string.char(00,32,00,32)}) =20 --- RTRIM +-- RIGHT TRIM test:do_execsql_test( "func-71", - "SELECT RTRIM(CHAR(32,00,32,00,32));", + "SELECT TRIM(TRAILING FROM CHAR(32,00,32,00,32));", {string.char(32,00,32,00)}) =20 -- GROUP_CONCAT diff --git a/test/sql-tap/with1.test.lua b/test/sql-tap/with1.test.lua index 495aa4ee4..19953e434 100755 --- a/test/sql-tap/with1.test.lua +++ b/test/sql-tap/with1.test.lua @@ -550,7 +550,7 @@ test:do_execsql_test("8.1-mandelbrot", [[ SELECT group_concat( substr(' .+*#', 1+min(iter/7,4), 1), '')=20 FROM m2 GROUP BY cy ) - SELECT group_concat(rtrim(t),x'0a') FROM a; + SELECT group_concat(trim(TRAILING FROM t),x'0a') FROM a; ]], { -- <8.1-mandelbrot> [[ ....#