From: Roman Khabibov
To: Vladislav Shpilevoy
Cc: tarantool-patches@dev.tarantool.org
Date: Sat, 27 Jun 2020 14:50:13 +0300
Subject: Re: [Tarantool-patches] [PATCH v3 1/2] sql: use unify pattern for column names
In-Reply-To: <3e040934-d6dc-200b-5925-04879fea3e83@tarantool.org>

Hi! Thanks for the review.

> On Jun 26, 2020, at 00:25, Vladislav Shpilevoy wrote:
>
> Hi! Thanks for the fixes!
>
>> diff --git a/test/sql-tap/colname.test.lua b/test/sql-tap/colname.test.lua
>> index caa61a07a..b5f04a2d7 100755
>> --- a/test/sql-tap/colname.test.lua
>> +++ b/test/sql-tap/colname.test.lua
>> @@ -635,4 +635,143 @@ test:do_catchsql_test(
>> +-- use the first column with name "COLUMN_1" from
>> +-- column list.
>> +test:do_execsql2_test(
>> +    "colname-12.16",
>> +    [[
>> +        SELECT column_1, column_1 COLLATE "unicode_ci" FROM j_1 ORDER BY column_1;
>> +    ]], {
>> +        "COLUMN_1",1,"COLUMN_1",1
>
> When there is just one row, all the sorting looks the same. So even if
> it will work not by first 'column_1', you won't notice. What makes the
> test not so useful. Please, make it so it would be clearly visible, that
> the sorting really used 'the first column'.
>
> Also it looks innatural that you apply "COLLATE "unicode_ci"" to numbers.
> Please, use an expression, which would result into something applicable to
> a number and not looking exactly the same.
>
> For example, 'column_1 + 1' or '-column_1'. The latter option would be
> especially useful to check how sorting changes, when you select
>
>     SELECT column_1, -column_1 FROM j_1 ORDER BY column_1;
>
> Or
>
>     SELECT -column_1, column_1 FROM j_1 ORDER BY column_1;
>
> If really the first matched column name is used, then the results
> should be different.

I was mistaken when I said that the first column is used. When an
identifier from ORDER BY is resolved, aliases are checked first
(resolveOrderGroupBy() in resolve.c). See the example below.

tarantool> box.execute("CREATE TABLE t(a INT PRIMARY KEY, b INT)")
---
- row_count: 1
...

tarantool> box.execute("INSERT INTO t VALUES(1, -1);")
---
- row_count: 1
...

tarantool> box.execute("INSERT INTO t VALUES(2, -2);")
---
- row_count: 1
...

tarantool> box.execute("SELECT a, b AS a FROM t ORDER BY a;")
---
- metadata:
  - name: A
    type: integer
  - name: A
    type: integer
  rows:
  - [2, -2]
  - [1, -1]
...

tarantool> box.execute("SELECT a, b AS a FROM t ORDER BY b;")
---
- metadata:
  - name: A
    type: integer
  - name: A
    type: integer
  rows:
  - [2, -2]
  - [1, -1]
...

In my patch, I used the zName field of struct ExprList_item, which is
also used for aliases. I found nothing in the code saying that this
field is only for aliases ( clause), so I considered it legitimate to
put auto names in it.

PostgreSQL would throw an error in the case above: "ambiguous names".
But it works for us. Is it a bug or not? Perhaps yes. In any case, this
is outside the scope of the patch and requires a separate discussion,
so I have just deleted this test.
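To make the alias-first lookup described above concrete, here is a
minimal sketch in C. It is not the code of resolveOrderGroupBy():
`struct result_col` and `resolve_order_by()` are hypothetical names
made up for this illustration, and identifiers are assumed to be
already normalized to upper case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one result column of a SELECT. */
struct result_col {
	/* Name given by an alias, or NULL if there is none. */
	const char *alias;
	/* Underlying table column, or NULL for a pure expression. */
	const char *column;
};

/*
 * Return the index of the result column that an ORDER BY identifier
 * refers to, or -1 if nothing matches. Aliases are checked first;
 * plain column names are only a fallback.
 */
int
resolve_order_by(const struct result_col *cols, int n, const char *ident)
{
	assert(cols != NULL && ident != NULL);
	/* First pass: an alias match wins, wherever it is in the list. */
	for (int i = 0; i < n; i++) {
		if (cols[i].alias != NULL &&
		    strcmp(cols[i].alias, ident) == 0)
			return i;
	}
	/* Second pass: fall back to the underlying column names. */
	for (int i = 0; i < n; i++) {
		if (cols[i].column != NULL &&
		    strcmp(cols[i].column, ident) == 0)
			return i;
	}
	return -1;
}
```

For `SELECT a, b AS a FROM t`, the result columns would be
{NULL, "A"} and {"A", "B"}. In this sketch both `ORDER BY a` and
`ORDER BY b` resolve to index 1, which is consistent with the two
queries above returning rows in the same order.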
commit ed196a4446177eb14aa1b86a382c32416edb5794
Author: Roman Khabibov
Date:   Thu Mar 5 12:48:58 2020 +0300

    sql: unify pattern for column names

    Name resulting columns generated by an expression or construction by
    the "COLUMN_N" pattern.

    Closes #3962

    @TarantoolBot document
    Title: Column naming in SQL

    Now, every auto generated column is named by the "COLUMN_N" pattern,
    where N is the ordinal number of the generated column in a query
    (starting from 1). An auto generated column is a column in a query
    result generated by an expression or a column from construction.

    Examples:
    ```
    box.execute("VALUES(1, 2, 3);")
    ---
    - metadata:
      - name: COLUMN_1
        type: integer
      - name: COLUMN_2
        type: integer
      - name: COLUMN_3
        type: integer
      rows:
      - [1, 2, 3]
    ...
    box.execute("SELECT * FROM (VALUES (1+1, 1+1));")
    ---
    - metadata:
      - name: COLUMN_1
        type: integer
      - name: COLUMN_2
        type: integer
      rows:
      - [2, 2]
    ...
    box.execute("SELECT 1+1, 1+1;")
    ---
    - metadata:
      - name: COLUMN_1
        type: integer
      - name: COLUMN_2
        type: integer
      rows:
      - [2, 2]
    ...
    ```

    Here, the expression "mycol + 1" generates a new column. Since it is
    the first auto generated resulting column, it is named "COLUMN_1".
    ```
    tarantool> CREATE TABLE test (mycol INT PRIMARY KEY);
    ---
    - row_count: 1
    ...

    tarantool> SELECT mycol, mycol + 1 FROM test;
    ---
    - metadata:
      - name: MYCOL
        type: integer
      - name: COLUMN_1
        type: integer
      rows: []
    ...
    ```
    Note that generated names can already be used within the same
    query, e.g. in clause.
    ```
    tarantool> SELECT mycol, mycol + 1 FROM test ORDER BY column_1;
    ---
    - metadata:
      - name: MYCOL
        type: integer
      - name: COLUMN_1
        type: integer
      rows: []
    ...
    ```

    It should also be noted that if you use column names that match
    the "COLUMN_N" pattern, the result can contain identical names:

    ```
    tarantool> CREATE TABLE test (column_1 SCALAR PRIMARY KEY);
    ---
    - row_count: 1
    ...

    tarantool> INSERT INTO test VALUES(1);
    ---
    - row_count: 1
    ...
    tarantool> SELECT column_1, column_1 COLLATE "unicode_ci" FROM test;
    ---
    - metadata:
      - name: COLUMN_1
        type: scalar
      - name: COLUMN_1
        type: scalar
      rows:
      - [1, 1]
    ...
    ```

diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index 4b069addb..26c735ed7 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -1854,14 +1854,14 @@ generate_column_metadata(struct Parse *pParse, struct SrcList *pTabList,
 			}
 		} else {
 			const char *z = NULL;
-			if (colname != NULL)
+			if (colname != NULL) {
 				z = colname;
-			else if (span != NULL)
-				z = span;
-			else
-				z = tt_sprintf("column%d", i + 1);
+			} else {
+				uint32_t idx = ++pParse->autoname_i;
+				z = sql_generate_column_name(idx);
+			}
 			vdbe_metadata_set_col_name(v, i, z);
-			if (is_full_meta && colname != NULL)
+			if (is_full_meta)
 				vdbe_metadata_set_col_span(v, i, span);
 		}
 	}
@@ -1897,7 +1897,6 @@ sqlColumnsFromExprList(Parse * parse, ExprList * expr_list,
 	/* Database connection */
 	sql *db = parse->db;
 	u32 cnt;		/* Index added to make the name unique */
-	Expr *p;		/* Expression for a single result column */
 	char *zName;		/* Column name */
 	int nName;		/* Size of name in zName[] */
 	Hash ht;		/* Hash table of column names */
@@ -1929,13 +1928,12 @@ sqlColumnsFromExprList(Parse * parse, ExprList * expr_list,
 	space_def->field_count = column_count;
 
 	for (uint32_t i = 0; i < column_count; i++) {
-		/* Get an appropriate name for the column
+		/*
+		 * Check if the column contains an "AS "
+		 * phrase.
 		 */
-		p = sqlExprSkipCollate(expr_list->a[i].pExpr);
-		if ((zName = expr_list->a[i].zName) != 0) {
-			/* If the column contains an "AS " phrase, use as the name */
-		} else {
-			Expr *pColExpr = p;	/* The expression that is the result column name */
+		if ((zName = expr_list->a[i].zName) == 0) {
+			struct Expr *pColExpr = expr_list->a[i].pExpr;
 			struct space_def *space_def = NULL;
 			while (pColExpr->op == TK_DOT) {
 				pColExpr = pColExpr->pRight;
@@ -1951,14 +1949,14 @@ sqlColumnsFromExprList(Parse * parse, ExprList * expr_list,
 			} else if (pColExpr->op == TK_ID) {
 				assert(!ExprHasProperty(pColExpr, EP_IntValue));
 				zName = pColExpr->u.zToken;
-			} else {
-				/* Use the original text of the column expression as its name */
-				zName = expr_list->a[i].zSpan;
 			}
 		}
-		if (zName == NULL)
-			zName = "_auto_field_";
-		zName = sqlMPrintf(db, "%s", zName);
+		if (zName == NULL) {
+			uint32_t idx = ++parse->autoname_i;
+			zName = sqlDbStrDup(db, sql_generate_column_name(idx));
+		} else {
+			zName = sqlDbStrDup(db, zName);
+		}
 
 		/* Make sure the column name is unique. If the name is not unique,
 		 * append an integer to the name so that it becomes unique.
@@ -4792,6 +4790,24 @@ selectPopWith(Walker * pWalker, Select * p)
 	}
 }
 
+/**
+ * Determine whether to generate a name for @a expr or not.
+ *
+ * Auto generated names is needed for every item in a