[tarantool-patches] Re: [PATCH 2/4] sql: add average tuple size calculation

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Apr 24 15:51:50 MSK 2018


See my 5 comments below, thanks.

On 23/04/2018 23:29, Nikita Pettik wrote:
> Average tuple size can be calculated by dividing space size by index
> tuple count. Thus, there is no need to hold this value as separate
> member. Thus, this patch removes from struct Index and struct Table such
> statistic. Note that now there are no partial indexes, so all indexes
> feature the same average tuple size. Also, commented unstable tests.
> 
> Part of #3253
> ---
>   src/box/sql/analyze.c          | 31 +++++++---------
>   src/box/sql/build.c            | 44 -----------------------
>   src/box/sql/pragma.c           | 20 +++++++++--
>   src/box/sql/select.c           | 26 +++++---------
>   src/box/sql/sqliteInt.h        | 24 +++++++++----
>   src/box/sql/where.c            | 41 +++++++++++----------
>   test/sql-tap/analyze9.test.lua | 81 +++++++++++++++++++++---------------------
>   7 files changed, 118 insertions(+), 149 deletions(-)
> 
> diff --git a/src/box/sql/analyze.c b/src/box/sql/analyze.c
> index 665bfbcb5..7c16a2154 100644
> --- a/src/box/sql/analyze.c
> +++ b/src/box/sql/analyze.c
> @@ -1196,6 +1196,18 @@ sqlite3Analyze(Parse * pParse, Token * pName)
>   		sqlite3VdbeAddOp0(v, OP_Expire);
>   }
>   
> +ssize_t
> +sql_index_tuple_size(struct space *space, struct index *idx)
> +{
> +	assert(space != NULL);
> +	assert(idx != NULL);
> +	assert(idx->def->space_id == space->def->id);
> +	ssize_t tuple_count = idx->vtab->size(idx);
> +	ssize_t space_size = space->vtab->bsize(space);

1. Lets use wrappers: index_size() and space_bsize() - they are defined
already.
> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index 0df8a71d4..391b7e0a2 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -1588,20 +1588,19 @@ generateSortTail(Parse * pParse,	/* Parsing context */
>    * the SQLITE_ENABLE_COLUMN_METADATA compile-time option is used.
>    */
>   #ifdef SQLITE_ENABLE_COLUMN_METADATA
> -#define columnType(A,B,C,D,E,F) columnTypeImpl(A,B,D,E,F)
> +#define columnType(A,B,C,D,E) columnTypeImpl(A,B,D,E)
>   #else				/* if !defined(SQLITE_ENABLE_COLUMN_METADATA) */
> -#define columnType(A,B,C,D,E,F) columnTypeImpl(A,B,F)
> +#define columnType(A,B,C,D,E) columnTypeImpl(A,B)
>   #endif
>   static enum field_type
> -columnTypeImpl(NameContext * pNC, Expr * pExpr,
> +columnTypeImpl(NameContext * pNC, Expr * pExpr
>   #ifdef SQLITE_ENABLE_COLUMN_METADATA
> -	       const char **pzOrigTab, const char **pzOrigCol,
> +	       , const char **pzOrigTab, const char **pzOrigCol,

2. As I can see, the third argument is always NULL. Lets remove it
too.
> diff --git a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h
> index 59662cf14..8ca8e808f 100644
> --- a/src/box/sql/sqliteInt.h
> +++ b/src/box/sql/sqliteInt.h
> @@ -1396,6 +1396,11 @@ struct BusyHandler {
>    */
>   #define IsPowerOfTwo(X) (((X)&((X)-1))==0)
>   
> +#ifdef ZERO_OR_DIV
> +#undef ZERO_OR_DIV
> +#endif
> +#define DIV_OR_ZERO(NUM, DENOM) (((DENOM) != 0) ? ((NUM) / (DENOM)) : 0)

3.

Divide by 0: *exists*
Programmers:https://pm1.narvii.com/6585/b5b717574d0d6250181c18aadd89fbe0b3c7bf3a_hq.jpg

Lets just inline it. And what is ZERO_OR_DIV?
> diff --git a/src/box/sql/where.c b/src/box/sql/where.c
> index 2a2630281..51b53c2df 100644
> --- a/src/box/sql/where.c
> +++ b/src/box/sql/where.c
> @@ -2545,15 +2537,31 @@ whereLoopAddBtreeIndex(WhereLoopBuilder * pBuilder,	/* The WhereLoop factory */
>   		 * seek only. Then, if this is a non-covering index, add the cost of
>   		 * visiting the rows in the main table.
>   		 */
> -		rCostIdx =
> -		    pNew->nOut + 1 +
> -		    (15 * pProbe->szIdxRow) / pSrc->pTab->szTabRow;
> +		struct space *space =
> +			space_by_id(SQLITE_PAGENO_TO_SPACEID(pProbe->tnum));
> +		assert(space != NULL);
> +		struct index *idx =
> +			space_index(space,
> +				    SQLITE_PAGENO_TO_INDEXID(pProbe->tnum));
> +		assert(idx != NULL);
> +		/*
> +		 * FIXME: currently, the procedure below makes no
> +		 * sense, since there are no partial indexes, so
> +		 * all indexes in the space feature the same
> +		 * average tuple size.

4. Do not forget about Vinyl. In it even with no partial indexes different
ones can contain different tuple count, tuples of different size (due to
specific of secondary indexes disk data structure). Now it does not support SQL,
but will do.
> diff --git a/test/sql-tap/analyze9.test.lua b/test/sql-tap/analyze9.test.lua
> index 4ce575e90..3b3d52f67 100755
> --- a/test/sql-tap/analyze9.test.lua
> +++ b/test/sql-tap/analyze9.test.lua
> +-- These tests are commented until query planer will be stable.

5. What do you mean saying 'unstable'? The test is flaky or incorrect?




More information about the Tarantool-patches mailing list