[tarantool-patches] Re: [PATCH 3/4] sql: refactor usages of table's tuple count
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Tue Apr 24 15:51:53 MSK 2018
See my 3 comments below, thanks.
On 23/04/2018 23:29, Nikita Pettik wrote:
> Count of tuples containing in given space can be calculated as a number of
> tuples in primary index. However, if table represents temporary object
> such as result set of SELECT or VIEW, tuple count can't be precisely
> determined. In this case, default approximation is used: 1 million tuples.
>
> Part of #3253
> diff --git a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h
> index 8ca8e808f..d026a213d 100644
> --- a/src/box/sql/sqliteInt.h
> +++ b/src/box/sql/sqliteInt.h
> @@ -2164,7 +2180,6 @@ struct Index {
> tRowcnt *aAvgEq; /* Average nEq values for keys not in aSample */
> IndexSample *aSample; /* Samples of the left-most key */
> tRowcnt *aiRowEst; /* Non-logarithmic stat1 data for this index */
> - tRowcnt nRowEst0; /* Non-logarithmic number of rows in the index */
> };
>
> /*
> @@ -2199,6 +2214,16 @@ struct IndexSample {
> tRowcnt *anDLt; /* Est. number of distinct keys less than this sample */
> };
>
> +#ifdef DEFAULT_TUPLE_COUNT
> +#undef DEFAULT_TUPLE_COUNT
1. Why do you need these guards? sqliteInt.h already has #ifndef SQLITEINT_H
#define SQLITEINT_H guard. If it does not help, then it is error and must be
fixed with no ifdef + undef for each definition.
> +#endif
> +#define DEFAULT_TUPLE_COUNT 1048576
> +
> +#ifdef DEFAULT_TUPLE_LOG_COUNT
> +#undef DEFAULT_TUPLE_LOG_COUNT
> +#endif
> +#define DEFAULT_TUPLE_LOG_COUNT sqlite3LogEst(DEFAULT_TUPLE_COUNT)
2. In the original code it was 200 to avoid calculation of
sqlite3LogEst(1048576) on each temporary table creation. Lets do the same.
> diff --git a/src/box/sql/where.c b/src/box/sql/where.c
> index 51b53c2df..e6f9ef431 100644
> --- a/src/box/sql/where.c
> +++ b/src/box/sql/where.c
> @@ -1326,8 +1326,22 @@ whereRangeScanEst(Parse * pParse, /* Parsing & code generating context */
> }
> /* Determine iLower and iUpper using ($P) only. */
> if (nEq == 0) {
> + /*
> + * In this simple case, there are no any
> + * equality constraints, so initially all rows
> + * are in range.
> + */
> iLower = 0;
> - iUpper = p->nRowEst0;
> + uint32_t space_id =
> + SQLITE_PAGENO_TO_SPACEID(p->tnum);
> + struct space *space = space_by_id(space_id);
> + assert(space != NULL);
> + uint32_t iid =
> + SQLITE_PAGENO_TO_INDEXID(p->tnum);
> + struct index *idx =
> + space_index(space, iid);
> + assert(idx != NULL);
> + iUpper = idx->vtab->size(idx);
3. Lets use index_count. And looks like this case is repeated not the first and
even not the second time. Lets implement a function sql_table_size(), that takes
a table pointer, and does PAGENO_TO... + space_by_id + space_index + index_size.
I would put it in one of the previous patches, where this case appears first.
> } else {
> /* Note: this call could be optimized away - since the same values must
> * have been requested when testing key $P in whereEqualScanEst().
> @@ -2786,7 +2800,7 @@ whereLoopAddBtree(WhereLoopBuilder * pBuilder, /* WHERE clause information */
> sPk.aiRowLogEst = aiRowEstPk;
> sPk.onError = ON_CONFLICT_ACTION_REPLACE;
> sPk.pTable = pTab;
> - aiRowEstPk[0] = pTab->nRowLogEst;
> + aiRowEstPk[0] = sql_space_tuple_log_count(pTab);
> aiRowEstPk[1] = 0;
> pFirst = pSrc->pTab->pIndex;
> if (pSrc->fg.notIndexed == 0) {
>
More information about the Tarantool-patches
mailing list