[tarantool-patches] Re: [PATCH 5/9] sql: improve type determination for column meta

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Mon Apr 22 21:02:18 MSK 2019


Thanks for the fixes!

On 18/04/2019 20:54, n.pettik wrote:
> 
> 
>> On 16 Apr 2019, at 17:12, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> wrote:
>>
>> On 14/04/2019 18:04, Nikita Pettik wrote:
>>> In most cases we don't assign and store type of node of expression AST
>>> (except for constant literals).  To determine type of node  we use
>>> sql_expr_type() function, which implements logic of quite simple
>>> recursive tree traversal. Before this patch we set type of node after
>>> code generation in sqlExprCodeTarget() without any traversal. This
>>> approach is way worse even then sql_expr_type().
>>
>> Why? How is recursive scan of the tree better than simple access to
>> a struct's attribute? I see, that your patch improves type determination
>> precision, but I do not understand how, and what a perf price for that?
> 
> Firstly, it seems that sqlExprCodeTarget() can be called on copy of
> original expr (when query contains order/group by clauses): I see
> that types assigned in that function are different from types in expr
> passed to generateColumnNames().

Ok, now I got it - the original expr still is unchanged, and types
resolving in codetarget() is thrown into a bucket.

> 
> Secondly, suggested approach simply makes code cleaner and easier
> to maintain: we shouldn’t care about adding on each branch of switch
> appropriate type (since it is already done in sql_expr_type()).
> 
> Eventually, sqlExprCodeTarget() doesn’t take into consideration
> types of child nodes. This drawback will appear when we extend
> integer type with small/big ints:
> 
> Column a is of type smallint.
> 
> SELECT 1 + a;
> 
> Should result in ‘smallint' type, not common ‘number’.
> Without tree traversal we can’t do that.
> 
> Price is a primitive traversal through the expr tree. This traversal (but in more
> sophisticated implementation and being called only once saving in each node
> its type) is anyway required to implement this task:
> https://github.com/tarantool/tarantool/issues/3103 <https://github.com/tarantool/tarantool/issues/3103>

I do not have anything against traversal. I just don't like multiple
traversal and seemingly free and fast call of deceptive sql_expr_type() -
it is expensive in fact, just look at this huge 'switch-case'.

Your patch is ok, it does not aggravate anything. But sql_expr_type()
is called in very many places. Probably, in scope of 3103 we should
set Expr type always before optimizer and Vdbe-coder are started. Just
thoughts.

> 
> In fact, we should once visit each node (probably after names resolution)
> and save its type depending on child’s types.
> 
>> If you are not sure about the price, we could ask Alexander to run
>> benches on your branch before pushing into the master.
> 
> This patch is auxiliary to main-patchset. I’ve added it to avoid
> abusing changing result files like sql/types.result etc. Can drop it tho.

Now I see, that there is nothing to bench. Almost pure refactoring.

>>> So, to improve accuracy
>>> of type determination, let's always call that method and remove type
>>> assignment in sqlExprCodeTarget().
>>
>> This patch fixes the issue
>> https://github.com/tarantool/tarantool/issues/4126.
> 
> Ok, added label.
> 

During the second review I found some orphan type assignments
in codetarget(). I dropped them and the tests passed. The diff is
on the branch.

==================================================================
diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 016d6f899..0ad4b74f8 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -3725,7 +3725,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 	case TK_AGG_COLUMN:{
 			AggInfo *pAggInfo = pExpr->pAggInfo;
 			struct AggInfo_col *pCol = &pAggInfo->aCol[pExpr->iAgg];
-			pExpr->type = pCol->pExpr->type;
 			if (!pAggInfo->directMode) {
 				assert(pCol->iMem > 0);
 				return pCol->iMem;
@@ -3757,7 +3756,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 					iTab = pParse->iSelfTab;
 				}
 			}
-			pExpr->type = pExpr->space_def->fields[col].type;
 			return sqlExprCodeGetColumn(pParse,
 							pExpr->space_def, col,
 							iTab, target,
@@ -3925,7 +3923,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 		}
 	case TK_UMINUS:{
 			Expr *pLeft = pExpr->pLeft;
-			pExpr->type = FIELD_TYPE_NUMBER;
 			assert(pLeft);
 			if (pLeft->op == TK_INTEGER) {
 				expr_code_int(pParse, pLeft, true, target);
@@ -3989,7 +3986,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 					 tt_sprintf(err, pExpr->u.zToken));
 				pParse->is_aborted = true;
 			} else {
-				pExpr->type = pInfo->aFunc->pFunc->ret_type;
 				return pInfo->aFunc[pExpr->iAgg].iMem;
 			}
 			break;




More information about the Tarantool-patches mailing list