From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 3A9282520D for ; Fri, 11 May 2018 13:29:34 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id hyngvqttBEA3 for ; Fri, 11 May 2018 13:29:34 -0400 (EDT) Received: from smtp50.i.mail.ru (smtp50.i.mail.ru [94.100.177.110]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id A014A25207 for ; Fri, 11 May 2018 13:29:33 -0400 (EDT) From: n.pettik Message-Id: <16ADACB2-9CCC-4282-9533-AB1FA6A5E770@tarantool.org> Content-Type: multipart/alternative; boundary="Apple-Mail=_348D6DBC-A0C0-4129-86FC-EA073184E54C" Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: [tarantool-patches] Re: [PATCH 2/4] sql: add average tuple size calculation Date: Fri, 11 May 2018 20:29:31 +0300 In-Reply-To: <46e6a842-2a0b-90fc-7da0-2268eb6d05e3@tarantool.org> References: <1483b8989b9065e4353a49fa236ea6acc8fbed93.1524515002.git.korablev@tarantool.org> <46e6a842-2a0b-90fc-7da0-2268eb6d05e3@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org Cc: Vladislav Shpilevoy --Apple-Mail=_348D6DBC-A0C0-4129-86FC-EA073184E54C Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii >> +ssize_t >> +sql_index_tuple_size(struct space *space, struct index *idx) >> +{ >> + assert(space !=3D NULL); >> + assert(idx !=3D NULL); >> + assert(idx->def->space_id =3D=3D space->def->id); >> + ssize_t tuple_count =3D idx->vtab->size(idx); >> + ssize_t space_size =3D space->vtab->bsize(space); >=20 > 1. Lets use wrappers: index_size() and space_bsize() - they are = defined > already. Done: +++ b/src/box/sql/analyze.c @@ -1197,9 +1197,10 @@ sql_index_tuple_size(struct space *space, struct = index *idx) assert(space !=3D NULL); assert(idx !=3D NULL); assert(idx->def->space_id =3D=3D space->def->id); - ssize_t tuple_count =3D idx->vtab->size(idx); - ssize_t space_size =3D space->vtab->bsize(space); - ssize_t avg_tuple_size =3D DIV_OR_ZERO(space_size, tuple_count); + ssize_t tuple_count =3D index_size(idx); + ssize_t space_size =3D space_bsize(space); + ssize_t avg_tuple_size =3D tuple_count !=3D 0 ? + (space_size / tuple_count) : 0; >> diff --git a/src/box/sql/select.c b/src/box/sql/select.c >> index 0df8a71d4..391b7e0a2 100644 >> --- a/src/box/sql/select.c >> +++ b/src/box/sql/select.c >> @@ -1588,20 +1588,19 @@ generateSortTail(Parse * pParse, /* = Parsing context */ >> * the SQLITE_ENABLE_COLUMN_METADATA compile-time option is used. >> */ >> #ifdef SQLITE_ENABLE_COLUMN_METADATA >> -#define columnType(A,B,C,D,E,F) columnTypeImpl(A,B,D,E,F) >> +#define columnType(A,B,C,D,E) columnTypeImpl(A,B,D,E) >> #else /* if = !defined(SQLITE_ENABLE_COLUMN_METADATA) */ >> -#define columnType(A,B,C,D,E,F) columnTypeImpl(A,B,F) >> +#define columnType(A,B,C,D,E) columnTypeImpl(A,B) >> #endif >> static enum field_type >> -columnTypeImpl(NameContext * pNC, Expr * pExpr, >> +columnTypeImpl(NameContext * pNC, Expr * pExpr >> #ifdef SQLITE_ENABLE_COLUMN_METADATA >> - const char **pzOrigTab, const char **pzOrigCol, >> + , const char **pzOrigTab, const char **pzOrigCol, >=20 > 2. As I can see, the third argument is always NULL. Lets remove it > too. Done: @@ -1655,8 +1655,8 @@ columnTypeImpl(NameContext * pNC, Expr * pExpr sNC.pNext =3D pNC; sNC.pParse =3D pNC->pParse; column_type =3D - columnType(&sNC, p, 0, - &zOrigTab, = &zOrigCol); + columnType(&sNC, p, = &zOrigTab, + &zOrigCol); @@ -1685,7 +1685,7 @@ columnTypeImpl(NameContext * pNC, Expr * pExpr sNC.pNext =3D pNC; sNC.pParse =3D pNC->pParse; column_type =3D - columnType(&sNC, p, 0, &zOrigTab, = &zOrigCol); + columnType(&sNC, p, &zOrigTab, &zOrigCol); @@ -1921,7 +1921,7 @@ sqlite3SelectAddColumnTypeAndCollation(Parse * = pParse, /* Parsing contexts */ for (i =3D 0, pCol =3D pTab->aCol; i < pTab->nCol; i++, pCol++) = { enum field_type type; p =3D a[i].pExpr; - type =3D columnType(&sNC, p, 0, 0, 0); + type =3D columnType(&sNC, p, 0, 0); >> diff --git a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h >> index 59662cf14..8ca8e808f 100644 >> --- a/src/box/sql/sqliteInt.h >> +++ b/src/box/sql/sqliteInt.h >> @@ -1396,6 +1396,11 @@ struct BusyHandler { >> */ >> #define IsPowerOfTwo(X) (((X)&((X)-1))=3D=3D0) >> +#ifdef ZERO_OR_DIV >> +#undef ZERO_OR_DIV >> +#endif >> +#define DIV_OR_ZERO(NUM, DENOM) (((DENOM) !=3D 0) ? ((NUM) / = (DENOM)) : 0) >=20 > 3. >=20 > Divide by 0: *exists* > = Programmers:https://pm1.narvii.com/6585/b5b717574d0d6250181c18aadd89fbe0b3= c7bf3a_hq.jpg = >=20 > Lets just inline it. And what is ZERO_OR_DIV? It was just typo in naming. Anyway, I have removed this macro: +++ b/src/box/sql/sqliteInt.h @@ -1389,11 +1389,6 @@ struct BusyHandler { */ #define IsPowerOfTwo(X) (((X)&((X)-1))=3D=3D0) =20 -#ifdef ZERO_OR_DIV -#undef ZERO_OR_DIV -#endif -#define DIV_OR_ZERO(NUM, DENOM) (((DENOM) !=3D 0) ? ((NUM) / (DENOM)) : = 0) - >> diff --git a/src/box/sql/where.c b/src/box/sql/where.c >> index 2a2630281..51b53c2df 100644 >> --- a/src/box/sql/where.c >> +++ b/src/box/sql/where.c >> @@ -2545,15 +2537,31 @@ whereLoopAddBtreeIndex(WhereLoopBuilder * = pBuilder, /* The WhereLoop factory */ >> * seek only. Then, if this is a non-covering index, add = the cost of >> * visiting the rows in the main table. >> */ >> - rCostIdx =3D >> - pNew->nOut + 1 + >> - (15 * pProbe->szIdxRow) / pSrc->pTab->szTabRow; >> + struct space *space =3D >> + = space_by_id(SQLITE_PAGENO_TO_SPACEID(pProbe->tnum)); >> + assert(space !=3D NULL); >> + struct index *idx =3D >> + space_index(space, >> + = SQLITE_PAGENO_TO_INDEXID(pProbe->tnum)); >> + assert(idx !=3D NULL); >> + /* >> + * FIXME: currently, the procedure below makes no >> + * sense, since there are no partial indexes, so >> + * all indexes in the space feature the same >> + * average tuple size. >=20 > 4. Do not forget about Vinyl. In it even with no partial indexes = different > ones can contain different tuple count, tuples of different size (due = to > specific of secondary indexes disk data structure). Now it does not = support SQL, > but will do. Ok, updated comment (and inlined macro): --- a/src/box/sql/where.c +++ b/src/box/sql/where.c @@ -2544,14 +2544,17 @@ whereLoopAddBtreeIndex(WhereLoopBuilder * = pBuilder, /* The WhereLoop factory */ * FIXME: currently, the procedure below makes no * sense, since there are no partial indexes, so * all indexes in the space feature the same - * average tuple size. + * average tuple size. Moreover, secondary + * indexes in Vinyl engine may contain different + * tuple count of different sizes. */ ssize_t avg_tuple_size =3D sql_index_tuple_size(space, = idx); struct index *pk =3D space_index(space, 0); assert(pProbe->pTable =3D=3D pSrc->pTab); ssize_t avg_tuple_size_pk =3D = sql_index_tuple_size(space, pk); - uint32_t partial_index_cost =3D DIV_OR_ZERO((15 * = avg_tuple_size), - = avg_tuple_size_pk); + uint32_t partial_index_cost =3D + avg_tuple_size_pk !=3D 0 ? + (15 * avg_tuple_size) / avg_tuple_size_pk : 0; >> diff --git a/test/sql-tap/analyze9.test.lua = b/test/sql-tap/analyze9.test.lua >> index 4ce575e90..3b3d52f67 100755 >> --- a/test/sql-tap/analyze9.test.lua >> +++ b/test/sql-tap/analyze9.test.lua >> +-- These tests are commented until query planer will be stable. >=20 > 5. What do you mean saying 'unstable'? The test is flaky or incorrect? Both. --Apple-Mail=_348D6DBC-A0C0-4129-86FC-EA073184E54C Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii

 +ssize_t
+sql_index_tuple_size(struct space *space, struct index = *idx)
+{
+ assert(space !=3D NULL);
+ = assert(idx !=3D NULL);
+ = assert(idx->def->space_id =3D=3D space->def->id);
+ = ssize_t tuple_count =3D idx->vtab->size(idx);
+ = ssize_t space_size =3D space->vtab->bsize(space);

1. Lets use wrappers: index_size() and = space_bsize() - they are defined
already.

Done:

+++ = b/src/box/sql/analyze.c
@@ -1197,9 +1197,10 @@ = sql_index_tuple_size(struct space *space, struct index = *idx)
        assert(space !=3D = NULL);
        assert(idx !=3D = NULL);
        = assert(idx->def->space_id =3D=3D = space->def->id);
-       ssize_t = tuple_count =3D idx->vtab->size(idx);
-     =   ssize_t space_size =3D = space->vtab->bsize(space);
-       = ssize_t avg_tuple_size =3D DIV_OR_ZERO(space_size, = tuple_count);
+       ssize_t tuple_count =3D = index_size(idx);
+       ssize_t space_size =3D = space_bsize(space);
+       ssize_t = avg_tuple_size =3D tuple_count !=3D 0 ?
+       =                     =      (space_size / tuple_count) : 0;

diff --git = a/src/box/sql/select.c b/src/box/sql/select.c
index = 0df8a71d4..391b7e0a2 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -1588,20 +1588,19 = @@ generateSortTail(Parse * pParse, /* Parsing context */
  * the SQLITE_ENABLE_COLUMN_METADATA compile-time = option is used.
  */
 #ifdef = SQLITE_ENABLE_COLUMN_METADATA
-#define = columnType(A,B,C,D,E,F) columnTypeImpl(A,B,D,E,F)
+#define = columnType(A,B,C,D,E) columnTypeImpl(A,B,D,E)
 #else /* if !defined(SQLITE_ENABLE_COLUMN_METADATA) */
-#define columnType(A,B,C,D,E,F) columnTypeImpl(A,B,F)
+#define columnType(A,B,C,D,E) columnTypeImpl(A,B)
 #endif
 static enum field_type
-columnTypeImpl(NameContext * pNC, Expr * pExpr,
+columnTypeImpl(NameContext * pNC, Expr * pExpr
 #ifdef SQLITE_ENABLE_COLUMN_METADATA
-       = ; const char **pzOrigTab, const char **pzOrigCol,
+ =       = ; , const char **pzOrigTab, const char **pzOrigCol,

2. As I can see, the third argument is = always NULL. Lets remove it
too.

Done:

@@ = -1655,8 +1655,8 @@ columnTypeImpl(NameContext * pNC, Expr * = pExpr
                =                     =     sNC.pNext =3D pNC;
        =                     =             sNC.pParse =3D = pNC->pParse;
            =                     =         column_type =3D
-     =                     =                   = columnType(&sNC, p, 0,
-         =                     =                     =      &zOrigTab, &zOrigCol);
+   =                     =                     = columnType(&sNC, p, &zOrigTab,
+       =                     =                     =        &zOrigCol);

@@ -1685,7 +1685,7 @@ = columnTypeImpl(NameContext * pNC, Expr * pExpr
    =                     = sNC.pNext =3D pNC;
            =             sNC.pParse =3D = pNC->pParse;
            =             column_type =3D
- =                     =       columnType(&sNC, p, 0, &zOrigTab, = &zOrigCol);
+             =               columnType(&sNC, p, = &zOrigTab, &zOrigCol);

@@= -1921,7 +1921,7 @@ sqlite3SelectAddColumnTypeAndCollation(Parse * = pParse,            /* Parsing contexts = */
        for (i =3D 0, pCol =3D = pTab->aCol; i < pTab->nCol; i++, pCol++) {
  =               enum field_type = type;
                = p =3D a[i].pExpr;
-             =   type =3D columnType(&sNC, p, 0, 0, 0);
+   =             type =3D columnType(&sNC, = p, 0, 0);

diff --git = a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h
index = 59662cf14..8ca8e808f 100644
--- = a/src/box/sql/sqliteInt.h
+++ b/src/box/sql/sqliteInt.h
@@ -1396,6 +1396,11 @@ struct BusyHandler {
  */
 #define IsPowerOfTwo(X) = (((X)&((X)-1))=3D=3D0)
 +#ifdef ZERO_OR_DIV
+#undef ZERO_OR_DIV
+#endif
+#define DIV_OR_ZERO(NUM, DENOM) (((DENOM) !=3D 0) ? ((NUM) / = (DENOM)) : 0)

3.

Divide by 0: *exists*
Programmers:https://pm1.narvii.com/6585/b5b717574d0d6250181c18aadd89fbe0b3c= 7bf3a_hq.jpg

Lets just inline it. And what is = ZERO_OR_DIV?

It was just typo in naming. Anyway, I have removed = this macro:

+++ = b/src/box/sql/sqliteInt.h
@@ -1389,11 +1389,6 @@ struct = BusyHandler {
  */
 #define = IsPowerOfTwo(X) = (((X)&((X)-1))=3D=3D0)
 
-#ifdef = ZERO_OR_DIV
-#undef = ZERO_OR_DIV
-#endif
-#define DIV_OR_ZERO(NUM, DENOM) = (((DENOM) !=3D 0) ? ((NUM) / (DENOM)) : 0)
-

diff --git = a/src/box/sql/where.c b/src/box/sql/where.c
index = 2a2630281..51b53c2df 100644
--- a/src/box/sql/where.c
+++ b/src/box/sql/where.c
@@ -2545,15 +2537,31 = @@ whereLoopAddBtreeIndex(WhereLoopBuilder * pBuilder, /* The = WhereLoop factory */
   * seek only. Then, if this = is a non-covering index, add the cost of
   * visiting the rows in the = main table.
   */
- rCostIdx = =3D
-     pNew->nO= ut + 1 +
-     (15 * = pProbe->szIdxRow) / pSrc->pTab->szTabRow;
+ struct = space *space =3D
+ = space_by_id(SQLITE_PAGENO_TO_SPACEID(pProbe->tnum));
+ = = assert(space !=3D NULL);
+ struct = index *idx =3D
+ space_index(space,
+ = = = =     SQLITE_PAGE= NO_TO_INDEXID(pProbe->tnum));
+ = assert(idx !=3D NULL);
+ /*
+ = =  * FIXME: = currently, the procedure below makes no
+  * sense, since there are no = partial indexes, so
+  * all indexes in the space = feature the same
+  * average tuple size.

4. Do not forget about Vinyl. In it even = with no partial indexes different
ones can contain different tuple = count, tuples of different size (due to
specific of secondary indexes = disk data structure). Now it does not support SQL,
but will do.

Ok, updated = comment (and inlined macro):

--- a/src/box/sql/where.c
+++ = b/src/box/sql/where.c
@@ -2544,14 +2544,17 @@ = whereLoopAddBtreeIndex(WhereLoopBuilder * pBuilder,     /* The = WhereLoop factory */
            =      * FIXME: currently, the procedure below makes = no
                =  * sense, since there are no partial indexes, so
  =                * all indexes in = the space feature the same
-         =        * average tuple size.
+   =              * average tuple size. = Moreover, secondary
+           =      * indexes in Vinyl engine may contain = different
+               =  * tuple count of different sizes.
      =            */
    =             ssize_t avg_tuple_size =3D = sql_index_tuple_size(space, idx);
        =         struct index *pk =3D space_index(space, = 0);
                = assert(pProbe->pTable =3D=3D pSrc->pTab);
    =             ssize_t avg_tuple_size_pk =3D = sql_index_tuple_size(space, pk);
-         =       uint32_t partial_index_cost =3D DIV_OR_ZERO((15 * = avg_tuple_size),
-             =                     =                     =     avg_tuple_size_pk);
+       =         uint32_t partial_index_cost =3D
+ =                     =   avg_tuple_size_pk !=3D 0 ?
+       =                 (15 * = avg_tuple_size) / avg_tuple_size_pk : 0;

diff --git = a/test/sql-tap/analyze9.test.lua b/test/sql-tap/analyze9.test.lua
index 4ce575e90..3b3d52f67 100755
--- = a/test/sql-tap/analyze9.test.lua
+++ = b/test/sql-tap/analyze9.test.lua
+-- These tests are = commented until query planer will be stable.

5. What do you mean saying 'unstable'? = The test is flaky or incorrect?

Both.

= --Apple-Mail=_348D6DBC-A0C0-4129-86FC-EA073184E54C--