Re: [Tarantool-patches] [PATCH v2 2/2] sql: don't modify column names

Tarantool development patches archive
 help / color / mirror / Atom feed

From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Roman Khabibov <roman.habibov@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 2/2] sql: don't modify column names
Date: Fri, 1 May 2020 21:39:48 +0200	[thread overview]
Message-ID: <db265c79-4885-65ea-8de1-ec3e856e7aa9@tarantool.org> (raw)
In-Reply-To: <20200426230900.23258-3-roman.habibov@tarantool.org>

Thanks for the patch!

How is it related to 3962? Seems like you could send it separately.

See 7 comments below.

On 27/04/2020 01:09, Roman Khabibov wrote:
> Don't print "_1" to the end of column name, if this name is

1. That is not only about _1. It is about all _i.

Also please elaborate when was it printed, and why should not.

> duplicated within the expression list during <FROM> expanding.
> 
> Colses #4545

2. Colses -> Closes.

> ---
>  src/box/sql/select.c                         |  30 +---
>  test/sql-tap/colname.test.lua                | 137 +------------------
>  test/sql-tap/view.test.lua                   |   2 +-
>  test/sql/boolean.result                      |   6 +-
>  test/sql/gh-4104-view-access-check.result    |   2 +-
>  test/sql/gh-4104-view-access-check.test.lua  |   2 +-
>  test/sql/gh-4545-no-col-name-modify.result   |  33 +++++
>  test/sql/gh-4545-no-col-name-modify.test.sql |  10 ++
>  8 files changed, 51 insertions(+), 171 deletions(-)
>  create mode 100644 test/sql/gh-4545-no-col-name-modify.result
>  create mode 100755 test/sql/gh-4545-no-col-name-modify.test.sql
> 
> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index d65266586..e6e5d3cd4 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -1895,13 +1895,9 @@ sqlColumnsFromExprList(Parse * parse, ExprList * expr_list,
>  {
>  	/* Database connection */
>  	sql *db = parse->db;
> -	u32 cnt;		/* Index added to make the name unique */
>  	Expr *p;		/* Expression for a single result column */
>  	char *zName;		/* Column name */
> -	int nName;		/* Size of name in zName[] */
> -	Hash ht;		/* Hash table of column names */
>  
> -	sqlHashInit(&ht);

3. Since this commit sqlHash becomes unused. You can remove it in a next commit.

>  	uint32_t column_count =
>  		expr_list != NULL ? (uint32_t)expr_list->nExpr : 0;
>  	/*
> @@ -1953,32 +1949,9 @@ sqlColumnsFromExprList(Parse * parse, ExprList * expr_list,
>  		}
>  		if (zName == NULL) {
>  			uint32_t idx = ++parse->autoname_i;
> -			zName = sqlDbStrDup(db, sql_generate_column_name(idx));
> -		} else {
> -			zName = sqlDbStrDup(db, zName);
> -		}
> -
> -		/* Make sure the column name is unique.  If the name is not unique,
> -		 * append an integer to the name so that it becomes unique.
> -		 */
> -		cnt = 0;
> -		while (zName && sqlHashFind(&ht, zName) != 0) {
> -			nName = sqlStrlen30(zName);
> -			if (nName > 0) {
> -				int j;
> -				for (j = nName - 1;
> -				     j > 0 && sqlIsdigit(zName[j]); j--);
> -				if (zName[j] == '_')
> -					nName = j;
> -			}
> -			zName =
> -			    sqlMPrintf(db, "%.*z_%u", nName, zName, ++cnt);
> +			zName = (char *) sql_generate_column_name(idx);

4. Change zName type to 'const char *' and remove the cast, please.

>  		}
>  		size_t name_len = strlen(zName);
> -		void *field = &space_def->fields[i];
> -		if (zName != NULL &&
> -		    sqlHashInsert(&ht, zName, field) == field)
> -			sqlOomFault(db);
>  		space_def->fields[i].name = region_alloc(region, name_len + 1);
>  		if (space_def->fields[i].name == NULL) {
>  			sqlOomFault(db);> diff --git a/test/sql/boolean.result b/test/sql/boolean.result
> index 51ec5820b..b92ba5bf2 100644
> --- a/test/sql/boolean.result
> +++ b/test/sql/boolean.result
> @@ -690,16 +690,16 @@ SELECT * FROM (SELECT t4.i, t5.i, a, b FROM t4, t5 WHERE a = false OR b = true);
>   | - metadata:
>   |   - name: I
>   |     type: integer
> - |   - name: I_1
> + |   - name: I
>   |     type: integer
>   |   - name: A
>   |     type: boolean
>   |   - name: B
>   |     type: boolean
>   |   rows:
> - |   - [100, 1, false, true]
> + |   - [100, 100, false, true]
> + |   - [100, 100, false, false]
>   |   - [100, 100, false, false]
> - |   - [100, 101, false, false]

5. Why did the result change? The selected columns were different:
t4.i and t5.i. They contained different values, it seems. Now the
select returns t4.i t4.i.

>   | ...
>  
>  -- Check VIEW.
> diff --git a/test/sql/gh-4545-no-col-name-modify.result b/test/sql/gh-4545-no-col-name-modify.result
> new file mode 100644
> index 000000000..be452488b
> --- /dev/null
> +++ b/test/sql/gh-4545-no-col-name-modify.result
> @@ -0,0 +1,33 @@
> +-- test-run result file version 2
> +CREATE TABLE t1 (a INT PRIMARY KEY)
> + | ---
> + | - row_count: 1
> + | ...
> +CREATE TABLE t2 (a INT PRIMARY KEY)
> + | ---
> + | - row_count: 1
> + | ...
> +
> +-- Names of columns can be duplicated.
> +SELECT * FROM (SELECT * FROM t1, t2)
> + | ---
> + | - metadata:
> + |   - name: A
> + |     type: integer
> + |   - name: A
> + |     type: integer
> + |   rows: []

6. Please, insert a couple of different values into t1 and t2.
And check if the result contains them both. Not just one column
returned 2 times.

Looks like it doesn't. t1.a is just returned twice. But if I
write 'SELECT * FROM t1, t2', then all is returned ok.

> + | ...
> +
> diff --git a/test/sql/gh-4545-no-col-name-modify.test.sql b/test/sql/gh-4545-no-col-name-modify.test.sql
> new file mode 100755
> index 000000000..d5f5e5a9d
> --- /dev/null
> +++ b/test/sql/gh-4545-no-col-name-modify.test.sql
> @@ -0,0 +1,10 @@
> +CREATE TABLE t1 (a INT PRIMARY KEY)
> +CREATE TABLE t2 (a INT PRIMARY KEY)
> +
> +-- Names of columns can be duplicated.
> +SELECT * FROM (SELECT * FROM t1, t2)
> +
> +-- Make sure that a view with duplicated column names
> +-- can't be created.
> +CREATE VIEW v AS SELECT * FROM t1, t2
> +CREATE VIEW v AS SELECT * FROM t1, (SELECT * FROM t2)
> \ No newline at end of file

7. Please, add an empty line in the end.

next prev parent reply	other threads:[~2020-05-01 19:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200426230900.23258-1-roman.habibov@tarantool.org>
2020-04-26 23:09 ` Roman Khabibov
2020-05-01 19:39   ` Vladislav Shpilevoy [this message]
2020-05-01 19:35 ` [Tarantool-patches] [PATCH v2 0/2] Name columns in a different way Vladislav Shpilevoy
     [not found] ` <20200426230900.23258-2-roman.habibov@tarantool.org>
2020-05-01 19:37   ` [Tarantool-patches] [PATCH v2 1/2] sql: use unify pattern for column names Vladislav Shpilevoy
2020-06-11 14:58     ` Roman Khabibov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=db265c79-4885-65ea-8de1-ec3e856e7aa9@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=roman.habibov@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 2/2] sql: don'\''t modify column names' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox