[tarantool-patches] Re: [PATCH v2 1/5] sql: introduce structs assembling DDL arguments during parsing

Mon Mar 25 14:25:04 MSK 2019

Hi! Thanks for the fixes!

On 14/03/2019 21:13, n.pettik wrote:
> 
> 
>> On 12 Mar 2019, at 15:50, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> wrote:
>>
>> Hi! Thanks for the fixes! See my comments below,
>> fixes at the end of the email, and on the branch
>> (it is np/gh-3914-fix-create-index-v2).
>>
>>>> Below I listed the calls you do to create each 'terminal' structure.
>>>>
>>>> 	TABLE RENAME:
>>>> 	alter_entity_def_init
>>>> 	rename_entity_def_init
>>>>
>>>> 	TABLE CREATE
>>>> 	alter_entity_def_init
>>>> 	create_entity_def_init
>>>> 	create_table_def_init
>>>>
>>>> 	TABLE DROP:
>>>> 	alter_entity_def_init
>>>> 	drop_entity_def_init
>>>>
>>>> 	TABLE ADD FK:
>>>> 	alter_entity_def_init
>>>> 	create_entity_def_init
>>>> 	create_constraint_def_init
>>>> 	create_fk_def_init
>>>>
>>>> 	DROP FK:
>>>> 	alter_entity_def_init
>>>> 	drop_entity_def_init
>>>>
>>>> 	DROP INDEX:
>>>> 	alter_entity_def_init
>>>> 	drop_entity_def_init
>>>>
>>>> 	ADD INDEX:
>>>> 	alter_entity_def_init
>>>> 	create_entity_def_init
>>>> 	create_index_def_init
>>>>
>>>> 	CHECK:
>>>>         alter_entity_def_init
>>>>         create_entity_def_init
>>>> 	create_ck_def_init
>>>>
>>>> 	CREATE TRIGGER:
>>>> 	alter_entity_def_init
>>>> 	create_entity_def_init
>>>> 	create_trigger_def_init
>>>>
>>>> 	DROP TRIGGER:
>>>> 	alter_entity_def_init
>>>> 	drop_entity_def_init
>>>>
>>>> Here are some problems with what you said earlier and with
>>>> the whole structure.
>>>>
>>>> Firstly, some of sequences are not finished with a an explicit
>>>> terminal. For example, *all* DROPs are finished with an abstract
>>>> drop_entity_def_init() taking an entity type explicitly. Despite
>>>> the words you said earlier that you initialize everything in steps.
>>>> So in the case of DROP the rule is violated already. I think, that
>>>> you should define static inline helpers in parse_def.h with names
>>>> drop_table_def_init(), drop_index_def_init(), drop_check_def_init()
>>>> etc.
>>> Drop procedure (i.e. arguments filling) is the same for all entities
>>> (this situation is likely to remain unchanged). Does it make any
>>> sense to duplicate code without any necessity? What these
>>> helpers are supposed to do?
>>
>> Nothing functional. Just syntax sugar and API consistency. Otherwise
>> your statement, that we have a tree of structures with terminal nodes,
>> is false. For other nodes you have _table/index/fk etc suffix, but
>> drop is just 'entity'. For me it looks odd. The same for rename_entity.
>>
>> Strictly speaking, your hierarchy of structures is even not a tree -
>> in your code create_table_def and create_view_def store alter_entity_def,
>> but do not initialize nor use it.
>>
>> Additionally, for not named constraints you do not call
>> alter_entity_def_init nor even create_constraint_def_init - you just
>> memset all the structure. So your tree in fact is really corrupted and
>> ill, sorry.
>>
>> Talking of duplication - each wrapper around drop_entity and
>> rename_entity would take 11 simple lines:
>>
>>     struct drop_<term>_def {
>>             struct drop_entity_def base;
>>     }
>>
>>     static inline void
>>     drop_<term>_def_init(struct drop_<term>_def *def,
>>                          struct SrcList *parent_name, struct Token *name,
>>                          bool if_exist)
>>     {
>>             drop_entity_def_init(&def->base, parent_name, name, if_exist,
>>                                  ENTITY_TYPE_<term>);
>>     }
>>
>> For compiler it is nothing - functions are inlined, for binary file
>> as well. For code and naming consistency it is a big deal. IMO. But
>> probably I should stop teaching here - you can leave it as is in terms
>> of list entities or listen to my objectives.
> 
> Just want to hear pros of this approach (cons are obvious).

For me cons are not obvious. Pros - more proper code structure.

>> Anyway you are likely to be a
>> single maintainer of all of this for a long time.
> 
> Sounds like a threat. I’d rather not :)

It is not a threat. It is a fact. At this moment you are a single
not-trainee developer on SQL. And I do not see any forthcoming
changes in that.

> 
>>>> Secondly, some words about create_entity_def_init(). It is the
>>>> most arguable function for a public usage (out of parse_def.h)
>>>> because hardly I can imagine a situation when you parsed a word
>>>> CREATE, but can not parse a next one: TABLE, TRIGGER, CONSTRAINT etc.
>>>> And we see it in the lists above. Below I listed all of this function
>>>> usage cases and how you can replace that function here.
>>>>
>>>> - create_entity_def_init() in create_table ::= rule, line 172. Here
>>>>   it is followed by create_table_def_init() on the next line.
>>>>
>>>> - create_entity_def_init() in cconsname_parse ::= rule line 250.
>>>>   This rule is a prefix for a constraint definition, so this function
>>>>   can be substituted with create_constraint_def_init().
>>>>
>>>> - create_entity_def_init() in cmd ::= rule (about CREATE VIEW),
>>>>   line 400. Here we see, that 1) you missed ENTITY_TYPE_VIEW,
>>>>   2) create_view_def_init() can be called here.
>>>>
>>>> - create_entity_def_init() in cmd ::= rule (about CREATE INDEX),
>>>>   line 1246. Here it is followed by create_index_def_init() on the
>>>>   next line.
>>>>
>>>> - create_entity_def_init() in trigger_decl ::= rule, line 1367.
>>>>   Here it is followed by create_trigger_def_init().
>>>>
>>>> - create_entity_def_init() in cmd ::= rule (about ALTER TABLE ADD
>>>>   CONSTRAINT FOREIGN KEY), line 1500. Here it is followed by
>>>>   create_constraint_def_init().
>>>>
>>>> So as you can see create_entity_def_init() is not necessary to use
>>>> out of parse_def.h. You can put it into
>>>> create_table/index/constraint_def().
>>> Ok, done. See updated version at the end of letter.
>>>> Now about create_constraint_def_init(). In your patch it is always
>>>> followed by create_fk_def_init().
>>> I didn’t get this point. create_constraint_def_init() may come before
>>> create_fk_def_init() as well as before create_index_def/create_ck_def.
>>
>> The only place, where create_constraint_def_init can not be replaced with
>> a next level term, is a rule "cconsname_parse ::= CONSTRAINT nm(X).". But
>> you did not answer below why do you need it here.
>>
>>>> But after you did the comment above
>>>> about create_entity_def_init(), you will have create_constraint_def_init()
>>>> in cconsname_parse ::= CONSTRAINT nm(X) rule. And I do not see any
>>>> concrete reasons why can not you remove it from here and use only
>>>> create_index_def_init(), create_ck_def_init() etc in ccons ::= rule.
>>>> You said, that some of pointers should be nullifed, but what pointers
>>>> and why? What if you do not nullify them right after CONSTRAINT word
>>>> is met?
>>> We had to nullify them since in previous versions members of some
>>> structures might leave uninitialised. Now _init funs fill all members,
>>> so there is no any problem.
>>
>> I see, that you nullify them, but you did not answer my questions. Why
>> can not you postpone that nullification until a next word is met -
>> UNIQUE, PK, FK, CK etc? It would allow you to fold
>> create_constraint_def_init and always use a term def.
>>
>> What is more, you initialize named constraints twice - first you
>> nullify them in cconsname_start with memset, and then call
>> create_constraint_def_init.
>>
>> I did my own investigation of this matter, and managed to remove
>> cconsname_start rule, as well as double init. After that I have
>> cconsname_parse followed by many term constraint rules, and I tried to
>> figure out why can't we just inline cconsname into terms ccons and
>> tcons and remove create_constraint_def_init from public. Especially
>> taking into account that it is not many changes thanks to that:
>> https://github.com/tarantool/tarantool/issues/3820.
>>
>> It means, that we could inline 'CONSTRAINT nm(X)' in 8 places only:
>>
>>     1) "ccons ::= PRIMARY KEY"
>>     2) "ccons ::= UNIQUE"
>>     3) "ccons ::= check_constraint_def"
>>     4) "ccons ::= REFERENCES"
>>
>>      + the same for tcons.
>>
>> But then I realized, that cconsname helps us to keep the name
>> optional. To solve that we need to somehow save nm(X) result until
>> a terminal constraint is reached, or nullify it. I think, that we
>> could add struct Token constraint_name to union of alter entities in
>> struct Parse, and fetch it from there in terminal constraint
>> definitions (PK, UNIQUE, CK, FK). It is possible, because in such
>> a case this union's memory is not occupied by create_constraint_def
>> until term. It will allow us to remove create_constraint_def_init
>> from public usage. And to close the issue 3820 alongside.
> 
> This approach brings us to the very first implementation,
> when name of constraint was saved as Token in struct Parse…

The very first approach was not only about this attribute. Also
it was about struct space *new_space, uint32_t fk_constraint_count,
struct rlist new_fk_constraint, bool is_new_table_autoinc.

>> This is what I wanted to hear from you.
>>
>> Besides, I have some lowlevel comments on the code.
>>
>> 1) Usually we do not pass structures by value, even such small
>> as struct Token.
> 
> Ok, I see you’ve already fixed this point, so skipped.
> 
>> 2) You've moved struct Token into parse_def.h, but kept its
>> routines in sqlite3Int.h. Please, move them too.
> 
> It’s not that easy. For instance, sqlTokenInit() uses sqlStrlen30(),
> which declaration is placed in sqlInt.h. Ok, we can use common
> strlen(). Next, sqlNameFromToken() uses sqlDbStrNDup() and
> sqlNormalizeName(), which in turn again are declared in sqlInt.h
> We can’t include sqlInt.h, otherwise we will get cyclic dependencies.
> Also I wanted to move sqlJoinType() (it is used only in parse.y and
> involves tokens routine), but it uses struct Parse to set error.
> I guess after replacing saving error message in struct Parse with
> diag_set(), we will be able to move it to parse.y
> 
> So, the only things I’ve really managed to move are
> sqlTokenInit() and array sqlIntTokens[]:

Ok, understand. Thanks.

See 4 comments below.

> 
> diff --git a/src/box/sql/parse.y b/src/box/sql/parse.y
> index 996f55d37..aa16cb874 100644
> --- a/src/box/sql/parse.y
> +++ b/src/box/sql/parse.y
> @@ -178,15 +179,16 @@ ifnotexists(A) ::= IF NOT EXISTS. {A = 1;}
>  
>  create_table_args ::= LP columnlist RP(E). {
>    sqlEndTable(pParse,&E,0);
> +  create_table_def_destroy(&pParse->create_table_def);
>  }
>  create_table_args ::= AS select(S). {
>    sqlEndTable(pParse,0,S);

1. Why do not you destroy table def here?

>    sql_select_delete(pParse->db, S);
>  }
> -columnlist ::= columnlist COMMA tconsdef.
> +columnlist ::= columnlist COMMA tcons.
>  columnlist ::= columnlist COMMA columnname carglist.
>  columnlist ::= columnname carglist.
> -columnlist ::= tconsdef.
> +columnlist ::= tcons.
>  columnname(A) ::= nm(A) typedef(Y). {sqlAddColumn(pParse,&A,&Y);}
>  
>  // An IDENTIFIER can be a generic identifier, or one of several
2. As I understand, now table def leaks if parser stops after sqlStartTable
but before sqlEndTable.

3. sql_create_index comment still refers to parse->new_space.

4. Please, write a comment above struct create_table_def create_table_def in
struct Parse why it is not in the union.