From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: roman <roman.habibov@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH v3 1/2] sql: use unify pattern for column names Date: Wed, 24 Jun 2020 01:12:25 +0200 [thread overview] Message-ID: <e1e9043a-e48f-97f8-2ec5-71791537b50d@tarantool.org> (raw) In-Reply-To: <b1f19857-21bf-7a21-f870-aab9086a2f22@tarantool.org> Hi! Thanks for the patch! >> 3. What if I specify first column by myself, and the second >> is an expression? Will it be _1 or _2? > > tarantool> CREATE TABLE j (column_1 SCALAR PRIMARY KEY); > --- > - row_count: 1 > ... > > tarantool> INSERT INTO j VALUES(1); > --- > - row_count: 1 > ... > > tarantool> SELECT column_1, column_1 COLLATE "unicode_ci" FROM j ORDER BY column_1; > --- > - metadata: > - name: COLUMN_1 > type: scalar > - name: COLUMN_1 > type: scalar > rows: > - [1, 1] > ... > > If we want that the second expression to be named "COLUMN_2". should we use a > hash table in struct Parse. We will push the names of all columns in the request > to it and generate names without collisions. I don't now anymore ways. > I would leave it as it is now. I didn't say anything about behaviour change. I just asked a simple question, and I don't see an answer here. I will repeat it: what if my query contains 2 columns - one normal column name and the second is an expression. Will the expression be visible as COLUMN_1 or as COLUMN_2 in the result set? SELECT mycol, mycol + 1 FROM test; Will 'mycol + 1' be stored as COLUMN_1 or COLUMN_2? It is a simple question to clarify in the documentation request. Because it will be asked if you don't explicitly say what will happen. >>> @@ -4951,6 +4951,18 @@ selectExpander(Walker * pWalker, Select * p) >>> || (pE->pLeft != 0 && pE->pLeft->op == TK_ID)); >>> if (pE->op == TK_DOT && pE->pRight->op == TK_ASTERISK) >>> break; >>> + /* >>> + * It is necessary to generate auto names for >>> + * expressions before clauses such as ORDER BY, >>> + * GROUP BY will be resolved. These names can be >>> + * found inside these clauses during resolving. >>> + */ >>> + if (pEList->a[k].zName == NULL && pE->op != TK_DOT && >>> + ((pE->flags & EP_Leaf) == 0 || pE->op != TK_ID)) { >>> + uint32_t idx = ++pParse->autoname_i; >>> + pEList->a[k].zName = >>> + sqlDbStrDup(db, sql_generate_column_name(idx)); >> 4. I don't think I understand what is happening here. Looks too >> complicated. Why do you need so many checks, what do these checks >> mean, how are they related to ORDER BY/GROUP BY? And how is it >> related to select expansion (the function is called selectExpander())? >> Why does it work fine when I use normal column names in ORDER BY/GROUP BY, >> and does not work here without this crutch? > Here I name all items (generate names) in expression list if it is needed. > I think this operation can be attributed to "expanding". Expanding is before > name resolving. At the resolving all items must be named. > > You can use normal column names in ORDER BY, GROUP BY etc, I checked it in Postgre > (Postgre names expression the same way). I think it is OK. Can you use auto names in PostgreSQL too? Inside the query. Are they a full featured column name, or just sugar for result set metadata? See 5 comments below. > diff --git a/src/box/sql/select.c b/src/box/sql/select.c > index 4b069ad..7a56136 100644 > --- a/src/box/sql/select.c > +++ b/src/box/sql/select.c > @@ -1989,6 +1987,8 @@ sqlColumnsFromExprList(Parse * parse, ExprList * expr_list, > memcpy(space_def->fields[i].name, zName, name_len); > space_def->fields[i].name[name_len] = '\0'; > } > + sqlDbFree(db, expr_list->a[i].zName); > + expr_list->a[i].zName = zName; 1. I deleted this hunk and nothing changed in the tests. Why do you need it? > } > cleanup: > sqlHashClear(&ht); > @@ -4792,6 +4792,25 @@ selectPopWith(Walker * pWalker, Select * p) > } > } > > +/** > + * Determine whether to generate a name for @a expr or not. > + * > + * Auto generated names is needed for every item in a <SELECT> > + * expression list except asterisks, dots and column names (also > + * if this item hasn't alias). > + * > + * @param expr Expression from expression list to analyze. > + * > + * @retval true/false If item is/isn't needed needed to name. > + */ > +static bool > +is_needed_to_name(struct Expr *expr) { 2. We put { on separate line. > + if (expr->op != TK_ASTERISK && expr->op != TK_DOT && > + ((expr->flags & EP_Leaf) == 0 || expr->op != TK_ID)) > + return true; > + return false; 3. You could simply return the condition itself. It is a boolean anyway. What I don't understand is what is 'leaf'. For example, there is an expression '1 + 2'. Are 1 and 2 leafs? Is '1 + 2' a leaf? What is a definition of the leaf here? > +} > + > /* > * This routine is a Walker callback for "expanding" a SELECT statement. > * "Expanding" means to do the following: > @@ -4941,19 +4960,27 @@ selectExpander(Walker * pWalker, Select * p) > * all tables. > * > * The first loop just checks to see if there are any "*" operators > - * that need expanding. > + * that need expanding and name items of the expression > + * list if needed. > */ > + i = pEList->nExpr; > for (k = 0; k < pEList->nExpr; k++) { 4. Why do you have 2 iterators now? > pE = pEList->a[k].pExpr; > - if (pE->op == TK_ASTERISK) > - break; > + if (pE->op == TK_ASTERISK && i == pEList->nExpr) > + i = k; > assert(pE->op != TK_DOT || pE->pRight != 0); > assert(pE->op != TK_DOT > || (pE->pLeft != 0 && pE->pLeft->op == TK_ID)); > - if (pE->op == TK_DOT && pE->pRight->op == TK_ASTERISK) > - break; > + if (pE->op == TK_DOT && pE->pRight->op == TK_ASTERISK && > + i == pEList->nExpr) > + i = k; > + if (pEList->a[k].zName == NULL && is_needed_to_name(pE)) { > + uint32_t idx = ++pParse->autoname_i; > + pEList->a[k].zName = > + sqlDbStrDup(db, sql_generate_column_name(idx)); > + } > } > - if (k < pEList->nExpr) { > + if (i < pEList->nExpr) { > /* > * If we get here it means the result set contains one or more "*" > * operators that need to be expanded. Loop through each expression 5. I don't see any test for the case when a table contains COLUMN_1, COLUMN_2, etc columns. As real columns. What will happen with the expression names then? This should be investigated how is it done in other DBs, and documented in the docbot request for us. Also probably you may need to ask Peter about it, if the other DBs' cases don't offer anything good enough to blindly copy.
next prev parent reply other threads:[~2020-06-23 23:12 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-06-11 15:18 [Tarantool-patches] [PATCH v3 0/2] Use-unify-pattern-for-column-names Roman Khabibov 2020-06-11 15:18 ` [Tarantool-patches] [PATCH v3 1/2] sql: use unify pattern for column names Roman Khabibov 2020-06-15 21:59 ` Vladislav Shpilevoy 2020-06-22 21:14 ` roman 2020-06-23 23:12 ` Vladislav Shpilevoy [this message] 2020-06-25 14:35 ` Roman Khabibov 2020-06-25 21:25 ` Vladislav Shpilevoy 2020-06-27 11:50 ` Roman Khabibov 2020-06-29 20:08 ` Vladislav Shpilevoy 2020-06-29 23:46 ` Roman Khabibov 2020-06-30 21:23 ` Vladislav Shpilevoy 2020-07-01 12:45 ` Roman Khabibov 2020-07-01 21:25 ` Vladislav Shpilevoy 2020-07-02 15:55 ` Roman Khabibov 2020-06-11 15:18 ` [Tarantool-patches] [PATCH v3 2/2] sql: print span more properly Roman Khabibov 2020-06-15 22:18 ` Vladislav Shpilevoy 2020-06-22 21:14 ` roman 2020-06-23 23:12 ` Vladislav Shpilevoy 2020-07-02 15:55 ` Roman Khabibov 2020-07-02 19:06 ` Nikita Pettik
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=e1e9043a-e48f-97f8-2ec5-71791537b50d@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=roman.habibov@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v3 1/2] sql: use unify pattern for column names' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox