[tarantool-patches] Re: [PATCH 3/4] sql: refactor usages of table's tuple count

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Apr 24 15:51:53 MSK 2018


See my 3 comments below, thanks.

On 23/04/2018 23:29, Nikita Pettik wrote:
> Count of tuples containing in given space can be calculated as a number of
> tuples in primary index. However, if table represents temporary object
> such as result set of SELECT or VIEW, tuple count can't be precisely
> determined. In this case, default approximation is used: 1 million tuples.
> 
> Part of #3253
> diff --git a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h
> index 8ca8e808f..d026a213d 100644
> --- a/src/box/sql/sqliteInt.h
> +++ b/src/box/sql/sqliteInt.h
> @@ -2164,7 +2180,6 @@ struct Index {
>   	tRowcnt *aAvgEq;	/* Average nEq values for keys not in aSample */
>   	IndexSample *aSample;	/* Samples of the left-most key */
>   	tRowcnt *aiRowEst;	/* Non-logarithmic stat1 data for this index */
> -	tRowcnt nRowEst0;	/* Non-logarithmic number of rows in the index */
>   };
>   
>   /*
> @@ -2199,6 +2214,16 @@ struct IndexSample {
>   	tRowcnt *anDLt;		/* Est. number of distinct keys less than this sample */
>   };
>   
> +#ifdef DEFAULT_TUPLE_COUNT
> +#undef DEFAULT_TUPLE_COUNT

1. Why do you need these guards? sqliteInt.h already has #ifndef SQLITEINT_H
#define SQLITEINT_H guard. If it does not help, then it is error and must be
fixed with no ifdef + undef for each definition.

> +#endif
> +#define DEFAULT_TUPLE_COUNT 1048576
> +
> +#ifdef DEFAULT_TUPLE_LOG_COUNT
> +#undef DEFAULT_TUPLE_LOG_COUNT
> +#endif
> +#define DEFAULT_TUPLE_LOG_COUNT sqlite3LogEst(DEFAULT_TUPLE_COUNT)

2. In the original code it was 200 to avoid calculation of
sqlite3LogEst(1048576) on each temporary table creation. Lets do the same.

> diff --git a/src/box/sql/where.c b/src/box/sql/where.c
> index 51b53c2df..e6f9ef431 100644
> --- a/src/box/sql/where.c
> +++ b/src/box/sql/where.c
> @@ -1326,8 +1326,22 @@ whereRangeScanEst(Parse * pParse,	/* Parsing & code generating context */
>   			}
>   			/* Determine iLower and iUpper using ($P) only. */
>   			if (nEq == 0) {
> +				/*
> +				 * In this simple case, there are no any
> +				 * equality constraints, so initially all rows
> +				 * are in range.
> +				 */
>   				iLower = 0;
> -				iUpper = p->nRowEst0;
> +				uint32_t space_id =
> +					SQLITE_PAGENO_TO_SPACEID(p->tnum);
> +				struct space *space = space_by_id(space_id);
> +				assert(space != NULL);
> +				uint32_t iid =
> +					SQLITE_PAGENO_TO_INDEXID(p->tnum);
> +				struct index *idx =
> +					space_index(space, iid);
> +				assert(idx != NULL);
> +				iUpper = idx->vtab->size(idx);

3. Lets use index_count. And looks like this case is repeated not the first and
even not the second time. Lets implement a function sql_table_size(), that takes
a table pointer, and does PAGENO_TO... + space_by_id + space_index + index_size.
I would put it in one of the previous patches, where this case appears first.

>   			} else {
>   				/* Note: this call could be optimized away - since the same values must
>   				 * have been requested when testing key $P in whereEqualScanEst().
> @@ -2786,7 +2800,7 @@ whereLoopAddBtree(WhereLoopBuilder * pBuilder,	/* WHERE clause information */
>   		sPk.aiRowLogEst = aiRowEstPk;
>   		sPk.onError = ON_CONFLICT_ACTION_REPLACE;
>   		sPk.pTable = pTab;
> -		aiRowEstPk[0] = pTab->nRowLogEst;
> +		aiRowEstPk[0] = sql_space_tuple_log_count(pTab);
>   		aiRowEstPk[1] = 0;
>   		pFirst = pSrc->pTab->pIndex;
>   		if (pSrc->fg.notIndexed == 0) {
> 




More information about the Tarantool-patches mailing list