Tarantool development patches archive
 help / color / mirror / Atom feed
From: "Ivan Koptelov" <ivan.koptelov@tarantool.org>
To: tarantool-patches@freelists.org
Cc: "Vladislav Shpilevoy" <v.shpilevoy@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH] sql: add index_def to struct Index
Date: Fri, 25 May 2018 19:19:38 +0300	[thread overview]
Message-ID: <1527265178.779881612@f401.i.mail.ru> (raw)
In-Reply-To: <23542530-7a35-0cf5-9c6b-7d60afdb4389@tarantool.org>


[-- Attachment #1.1: Type: text/plain, Size: 9146 bytes --]




>Вторник, 22 мая 2018, 22:10 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:
>
>Hello. Thanks for the patch! See 26 comments below.
>
>On 22/05/2018 19:03, Ivan Koptelov wrote:
>> Currently there is a problem with
>> non-deterministic nullability in
>> index_def. To see the error just
>> uncomment printfs on build.c 2999
>> and run any sql-tap test that
>> has to do with indexes, e.g
>> distinct.test.lua.
>> github branch:  https://github.com/tarantool/tarantool/tree/sb/gh-3369-use-index-def-in-select-and-where
>> ---
>>   src/box/sql/build.c     | 179 +++++++++++++++++++++++++++++++++++++++++++++++-
>>   src/box/sql/sqliteInt.h |   2 +
>>   2 files changed, 179 insertions(+), 2 deletions(-)
>> 
>> diff --git a/src/box/sql/build.c b/src/box/sql/build.c
>> index 742f8a572..6bd70fa9f 100644
>> --- a/src/box/sql/build.c
>> +++ b/src/box/sql/build.c
>> @@ -43,6 +43,7 @@
>>    *     COMMIT
>>    *     ROLLBACK
>>    */
>> +#include <ctype.h>
>
>1. See the comment 2 and remove it, please.
>
>>   #include "sqliteInt.h"
>>   #include "vdbeInt.h"
>>   #include "tarantoolInt.h"
>> @@ -2841,9 +2842,173 @@ index_is_unique(Index *idx)
>>   	return tnt_index->def->opts.is_unique;
>>   }
>> 
>> +/*
>> + * In CREATE INDEX requests user may set binary collation with COLLATE
>> + * [BINARY, binary, "binary" and "bInarY"] This function is to handle all
>> + * these cases.
>> + */
>> +bool is_binary_collation(const char *coll_name)
>> +{
>> +	char *upper_case_name =
>> +		malloc(sizeof(*coll_name) * strlen(coll_name));
>> +	for (int i = 0; i < (int) strlen(coll_name); i++) {
>> +		upper_case_name[i] = toupper(coll_name[i]);
>> +	}
>> +
>> +	bool res = (strcmp(upper_case_name, "BINARY") == 0);
>> +	free(upper_case_name);
>> +	return res;
>
>2. All this function is identical to simple strcasecmp(coll_name, "binary").
>
>> +}
>> +
>> +//bool is_index_equal(Index * old, struct index_def *new)
>> +//{
>> +//	assert(strcmp(old->zName, new->name);
>> +//	return true;
>> +//}
>
>3. Garbage diff.
>
>> +
>> +struct index_def *create_index_def(Table *pTab, const char *name,
>> +				   uint32_t name_len, int onError,
>> +				   ExprList *pList, u8 idxType)
>
>4. According to Tarantool code style you should wrap line after return type
>definition. And please, use snake_case for variable names.
>
>> +{
>> +	struct space_def *space_def = pTab->def;
>> +	struct space *space = space_by_id(space_def->id);
>
>5. space_by_id takes a space from the space cache, but during
>'CREATE TABLE' generation the space is not in the cache. This will
>return NULL.
>
>You must not use space * here at all. I see, that you use it
>in a single place to find a primary index key_def, but you can get it
>from the pTab from the first pIndex - it is always primary one.
>
>> +
>> +	uint32_t iid = 0;
>> +	Index *temp = pTab->pIndex;
>> +	while (temp != NULL) {
>> +		iid++;
>> +		temp = temp->pNext;
>
>6. Bad idea. You must take Index * as the function argument, and initialize its
>index_def.
>
>> +	}
>> +
>> +	struct index_opts *opts = (struct index_opts *) malloc(sizeof(*opts));
>
>7. Why can not you declare it on the stack?
>
>> +	index_opts_create(opts);
>> +	opts->sql = NULL;
>> +
>> +	if (onError == ON_CONFLICT_ACTION_NONE) {
>> +		opts->is_unique = false;
>> +	} else {
>> +		opts->is_unique = true;
>> +	}
>
>8. Do not put 'if' body in {} when it consists of a single line.
>
>9. Here you can simply do this:
>opts->is_unique = onError != ON_CONFLICT_ACTION_NONE;
>
>> +
>> +	struct key_part_def *part_def_tmp = (struct key_part_def *)
>> +		malloc(sizeof(*part_def_tmp) * pList->nExpr);
>
>10. Please, do not do this. Use key_def_new + multiple key_def_set_part.
>You use too many mallocs.
>
>> +
>> +	uint32_t part_count = 0;
>> +	int i = 0;
>> +	struct ExprList_item *pListItem;
>> +
>> +	/*
>> +	 * In case of ordinary index creation we want ot store a
>> +	 * SQL statement of this creation.
>> +	 */
>> +	if (idxType == SQLITE_IDXTYPE_APPDEF) {
>> +		asprintf(&opts->sql,"CREATE INDEX %s ON %s(",
>> +			 name, space_def->name);
>
>11. You should check if asprintf is failed. Or do not use asprintf at
>all. See below how.
>
>> +	}
>> +
>> +	for (i = 0, pListItem = pList->a; i < pList->nExpr; pListItem++, i++) {
>> +		Expr *pCExpr = sqlite3ExprSkipCollate(pListItem->pExpr);
>> +
>> +		if (pCExpr->op == TK_COLUMN) {
>> +			uint32_t fieldno = pCExpr->iColumn;
>> +			part_def_tmp[i].fieldno = fieldno;
>> +			part_def_tmp[i].nullable_action =
>> +				ON_CONFLICT_ACTION_DEFAULT;
>
>12. Wrong - action default is internal thing, that can not be
>set in key_part actually. You must set action to the same value as
>the space field is set.
>
>> +			part_def_tmp[i].type =
>> +				space_def->fields[fieldno].type;
>> +
>> +//			part_def_tmp[i].is_nullable =
>> +//				(fieldno < space_def->field_count ?
>> +//				 space_def->fields[fieldno].is_nullable :
>> +//				 false);
>> +
>> +			part_def_tmp[i].is_nullable =
>> +				table_column_is_nullable(pTab->def, fieldno);
>
>13. Please, do not use table_column_is_nullable. It does space_by_id(def->id),
>but here def->id is not defined when you create a primary index inside a CREATE
>TABLE request.
>
>> +
>> +			part_def_tmp[i].coll_id =
>> +				space_def->fields[fieldno].coll_id ?
>> +				space_def->fields[fieldno].coll_id : COLL_NONE;
>
>14. Do not use integers as booleans.
>
>15. Why not just part_def_tmp[i].coll_id = space_def->fields[fieldno].coll_id?
>Besides, coll_id == 0 is valid collation identifier. It is not the same as COLL_NONE.
>
>> +			part_count++;
>
>16. Why do you need this counter? You already know part count - it is equal
>to pList->nExpr.
>
>> +
>> +			if (opts->sql != NULL) {
>
>17. How can it be NULL?
>
>> +				if (part_count == 1) {
>> +					asprintf(&opts->sql,"%s%s,",opts->sql,
>> +						 space_def->fields[fieldno].name);
>> +				} else {
>> +					asprintf(&opts->sql,"%s %s,",opts->sql,
>> +						 space_def->fields[fieldno].name);
>> +				}
>> +			}
>> +		} else {
>> +			assert(pCExpr->op == TK_COLUMN);
>
>18. So I see this:
>if (something) {
>  ...
>} else {
>  assert(something);
>}
>
>Very strange. This assert fails always.
>
>> +		}
>> +
>> +		if (pListItem->pExpr->op == TK_COLLATE) {
>
>19. Here you completely skipped column initialization when it has
>COLLATE. See sqlite3ExprSkipCollate, the same code on line 3235, and
>how COLLATE is stored in Expr tree (it replaces the original Expr making
>it to be COLLATE's chils). This is why here pList->nExpr is the same as
>part count.
>
>> +			const char *coll_name = pListItem->pExpr->u.zToken;
>> +			assert(coll_name != NULL);
>> +			size_t coll_name_len = strlen(coll_name);
>> +			struct coll *coll = coll_by_name(coll_name, coll_name_len);
>> +
>> +			if (is_binary_collation(coll_name)) {
>> +				part_def_tmp[i].coll_id = COLL_NONE;
>> +			} else {
>> +				assert(coll != NULL);
>
>20. Why? Collation can be not found. Use sqlite3GetCollSeq. But before do
>rebase on the latest 2.0. You branch looks very out of date. Many things
>were changed after the latest 1.10 into 2.0 merge.
>
>> +				part_def_tmp[i].coll_id = coll->id;
>> +			}
>> +
>> +			if (opts->sql != NULL) {
>
>21. How can it be NULL?
>
>> +				/*
>> +				 * If index part has a collation then
>> +				 * corresponding part of statement should
>> +				 * be changed from 'column_name,' to
>> +				 * 'column_name COLLATE collation_name,'.
>> +				 */
>> +				opts->sql[strlen(opts->sql) - 1] = ' ';
>> +				asprintf(&opts->sql,"%sCOLLATE %s,", opts->sql,
>> +					 coll_name);
>
>22. Here and in all others asprintf you have leak - you reset opts->sql with new
>malloc (called in asprintf), but do not free the original. And please, use region
>instead. You can accumulate SQL parts via multiple region_alloc + single
>region_join + strdup at the end. It makes leaks be impossible, and does not
>require any malloc + free.
>
>> +			}
>> +		}
>> +		if (pListItem->pExpr->op != TK_COLUMN &&
>> +		    pListItem->pExpr->op != TK_COLLATE) {
>> +			printf("error %d\n", pListItem->pExpr->op);
>> +		}
>> +	}
>> +
>> +	if (idxType == SQLITE_IDXTYPE_APPDEF) {
>> +		opts->sql[strlen(opts->sql) - 1] = ')';
>> +	}
>> +
>> +	struct key_part_def *part_def = (struct key_part_def *)
>> +		malloc(sizeof(*part_def) * part_count + 1);
>> +
>> +	memcpy(part_def, part_def_tmp, sizeof(*part_def) * part_count);
>> +	free(part_def_tmp);
>> +
>> +	struct key_def *key_def = key_def_new_with_parts(part_def, part_count);
>> +	assert(key_def != NULL);
>
>23. All this things can be removed by usage of key_def_new +
>multiple key_def_set_part.
>
>> +
>> +	struct key_def *pk_key_def;
>> +	if (idxType == SQLITE_IDXTYPE_APPDEF) {
>> +		pk_key_def = space_index_key_def(space, 0);
>
>24. You must not use a space here.
>
>> +	} else {
>> +		pk_key_def = NULL;
>> +	}
>> +
>> +	enum index_type default_type = TREE;
>> +
>> +	printf("space id %d index name %s index iid %d\n", space_def->id, name, iid);
>> +	printf("nullables\n");
>> +	for (uint32_t j = 0; j < key_def->part_count; j++){
>> +		printf("%d\n", key_part_is_nullable(&key_def->parts[i]));
>
Fixed all of the comments' issues.

-- 
Ivan Koptelov

[-- Attachment #1.2: Type: text/html, Size: 11473 bytes --]

[-- Attachment #2: 0001-add-index_def-to-Index.patch --]
[-- Type: application/x-patch, Size: 6096 bytes --]

      reply	other threads:[~2018-05-25 16:19 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-22 16:03 [tarantool-patches] " Ivan Koptelov
2018-05-22 16:30 ` [tarantool-patches] " Kirill Yukhin
2018-05-22 19:10 ` Vladislav Shpilevoy
2018-05-25 16:19   ` Ivan Koptelov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1527265178.779881612@f401.i.mail.ru \
    --to=ivan.koptelov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH] sql: add index_def to struct Index' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox