Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Nikita Pettik <korablev@tarantool.org>, tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH 3/4] sql: refactor usages of table's tuple count
Date: Tue, 24 Apr 2018 15:51:53 +0300	[thread overview]
Message-ID: <bf2cd090-f0f6-1d4a-4de9-7845d09d1b48@tarantool.org> (raw)
In-Reply-To: <7c93d6e8d43dbf9c912daca4ff1762d1fe412b35.1524515002.git.korablev@tarantool.org>

See my 3 comments below, thanks.

On 23/04/2018 23:29, Nikita Pettik wrote:
> Count of tuples containing in given space can be calculated as a number of
> tuples in primary index. However, if table represents temporary object
> such as result set of SELECT or VIEW, tuple count can't be precisely
> determined. In this case, default approximation is used: 1 million tuples.
> 
> Part of #3253
> diff --git a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h
> index 8ca8e808f..d026a213d 100644
> --- a/src/box/sql/sqliteInt.h
> +++ b/src/box/sql/sqliteInt.h
> @@ -2164,7 +2180,6 @@ struct Index {
>   	tRowcnt *aAvgEq;	/* Average nEq values for keys not in aSample */
>   	IndexSample *aSample;	/* Samples of the left-most key */
>   	tRowcnt *aiRowEst;	/* Non-logarithmic stat1 data for this index */
> -	tRowcnt nRowEst0;	/* Non-logarithmic number of rows in the index */
>   };
>   
>   /*
> @@ -2199,6 +2214,16 @@ struct IndexSample {
>   	tRowcnt *anDLt;		/* Est. number of distinct keys less than this sample */
>   };
>   
> +#ifdef DEFAULT_TUPLE_COUNT
> +#undef DEFAULT_TUPLE_COUNT

1. Why do you need these guards? sqliteInt.h already has #ifndef SQLITEINT_H
#define SQLITEINT_H guard. If it does not help, then it is error and must be
fixed with no ifdef + undef for each definition.

> +#endif
> +#define DEFAULT_TUPLE_COUNT 1048576
> +
> +#ifdef DEFAULT_TUPLE_LOG_COUNT
> +#undef DEFAULT_TUPLE_LOG_COUNT
> +#endif
> +#define DEFAULT_TUPLE_LOG_COUNT sqlite3LogEst(DEFAULT_TUPLE_COUNT)

2. In the original code it was 200 to avoid calculation of
sqlite3LogEst(1048576) on each temporary table creation. Lets do the same.

> diff --git a/src/box/sql/where.c b/src/box/sql/where.c
> index 51b53c2df..e6f9ef431 100644
> --- a/src/box/sql/where.c
> +++ b/src/box/sql/where.c
> @@ -1326,8 +1326,22 @@ whereRangeScanEst(Parse * pParse,	/* Parsing & code generating context */
>   			}
>   			/* Determine iLower and iUpper using ($P) only. */
>   			if (nEq == 0) {
> +				/*
> +				 * In this simple case, there are no any
> +				 * equality constraints, so initially all rows
> +				 * are in range.
> +				 */
>   				iLower = 0;
> -				iUpper = p->nRowEst0;
> +				uint32_t space_id =
> +					SQLITE_PAGENO_TO_SPACEID(p->tnum);
> +				struct space *space = space_by_id(space_id);
> +				assert(space != NULL);
> +				uint32_t iid =
> +					SQLITE_PAGENO_TO_INDEXID(p->tnum);
> +				struct index *idx =
> +					space_index(space, iid);
> +				assert(idx != NULL);
> +				iUpper = idx->vtab->size(idx);

3. Lets use index_count. And looks like this case is repeated not the first and
even not the second time. Lets implement a function sql_table_size(), that takes
a table pointer, and does PAGENO_TO... + space_by_id + space_index + index_size.
I would put it in one of the previous patches, where this case appears first.

>   			} else {
>   				/* Note: this call could be optimized away - since the same values must
>   				 * have been requested when testing key $P in whereEqualScanEst().
> @@ -2786,7 +2800,7 @@ whereLoopAddBtree(WhereLoopBuilder * pBuilder,	/* WHERE clause information */
>   		sPk.aiRowLogEst = aiRowEstPk;
>   		sPk.onError = ON_CONFLICT_ACTION_REPLACE;
>   		sPk.pTable = pTab;
> -		aiRowEstPk[0] = pTab->nRowLogEst;
> +		aiRowEstPk[0] = sql_space_tuple_log_count(pTab);
>   		aiRowEstPk[1] = 0;
>   		pFirst = pSrc->pTab->pIndex;
>   		if (pSrc->fg.notIndexed == 0) {
> 

  reply	other threads:[~2018-04-24 12:51 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-23 20:29 [tarantool-patches] [PATCH 0/4] Move original SQLite's statistics to server Nikita Pettik
2018-04-23 20:29 ` [tarantool-patches] [PATCH 1/4] sql: optimize compilation of SELECT COUNT(*) Nikita Pettik
2018-04-24 12:51   ` [tarantool-patches] " Vladislav Shpilevoy
2018-05-11 17:29     ` n.pettik
2018-04-23 20:29 ` [tarantool-patches] [PATCH 2/4] sql: add average tuple size calculation Nikita Pettik
2018-04-24 12:51   ` [tarantool-patches] " Vladislav Shpilevoy
2018-05-11 17:29     ` n.pettik
2018-04-23 20:29 ` [tarantool-patches] [PATCH 3/4] sql: refactor usages of table's tuple count Nikita Pettik
2018-04-24 12:51   ` Vladislav Shpilevoy [this message]
2018-05-11 17:29     ` [tarantool-patches] " n.pettik
2018-04-23 20:29 ` [tarantool-patches] [PATCH 4/4] sql: move SQL statistics to server Nikita Pettik
2018-04-24 12:51   ` [tarantool-patches] " Vladislav Shpilevoy
2018-05-11 17:29     ` n.pettik
2018-05-11 22:00       ` Vladislav Shpilevoy
2018-05-14 11:52         ` n.pettik
2018-05-14 12:54           ` Vladislav Shpilevoy
2018-05-14 13:55             ` n.pettik
2018-05-14 14:12 ` [tarantool-patches] Re: [PATCH 0/4] Move original SQLite's " Vladislav Shpilevoy
2018-05-15 13:42   ` Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf2cd090-f0f6-1d4a-4de9-7845d09d1b48@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=korablev@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='[tarantool-patches] Re: [PATCH 3/4] sql: refactor usages of table'\''s tuple count' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox