From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id B13DF293D4 for ; Mon, 22 Apr 2019 14:02:20 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Llz7H7f5s7J5 for ; Mon, 22 Apr 2019 14:02:20 -0400 (EDT) Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 462A9294DB for ; Mon, 22 Apr 2019 14:02:20 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH 5/9] sql: improve type determination for column meta References: <9edb1536-e63c-d9ab-5130-1f38b822d698@tarantool.org> <4D657191-4BC6-43F0-97A5-64B784DB0DE8@tarantool.org> From: Vladislav Shpilevoy Message-ID: <72727290-ccad-85b7-69da-32b59be955a6@tarantool.org> Date: Mon, 22 Apr 2019 21:02:18 +0300 MIME-Version: 1.0 In-Reply-To: <4D657191-4BC6-43F0-97A5-64B784DB0DE8@tarantool.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: "n.pettik" , tarantool-patches@freelists.org Thanks for the fixes! On 18/04/2019 20:54, n.pettik wrote: > > >> On 16 Apr 2019, at 17:12, Vladislav Shpilevoy wrote: >> >> On 14/04/2019 18:04, Nikita Pettik wrote: >>> In most cases we don't assign and store type of node of expression AST >>> (except for constant literals). To determine type of node we use >>> sql_expr_type() function, which implements logic of quite simple >>> recursive tree traversal. Before this patch we set type of node after >>> code generation in sqlExprCodeTarget() without any traversal. This >>> approach is way worse even then sql_expr_type(). >> >> Why? How is recursive scan of the tree better than simple access to >> a struct's attribute? I see, that your patch improves type determination >> precision, but I do not understand how, and what a perf price for that? > > Firstly, it seems that sqlExprCodeTarget() can be called on copy of > original expr (when query contains order/group by clauses): I see > that types assigned in that function are different from types in expr > passed to generateColumnNames(). Ok, now I got it - the original expr still is unchanged, and types resolving in codetarget() is thrown into a bucket. > > Secondly, suggested approach simply makes code cleaner and easier > to maintain: we shouldn’t care about adding on each branch of switch > appropriate type (since it is already done in sql_expr_type()). > > Eventually, sqlExprCodeTarget() doesn’t take into consideration > types of child nodes. This drawback will appear when we extend > integer type with small/big ints: > > Column a is of type smallint. > > SELECT 1 + a; > > Should result in ‘smallint' type, not common ‘number’. > Without tree traversal we can’t do that. > > Price is a primitive traversal through the expr tree. This traversal (but in more > sophisticated implementation and being called only once saving in each node > its type) is anyway required to implement this task: > https://github.com/tarantool/tarantool/issues/3103 I do not have anything against traversal. I just don't like multiple traversal and seemingly free and fast call of deceptive sql_expr_type() - it is expensive in fact, just look at this huge 'switch-case'. Your patch is ok, it does not aggravate anything. But sql_expr_type() is called in very many places. Probably, in scope of 3103 we should set Expr type always before optimizer and Vdbe-coder are started. Just thoughts. > > In fact, we should once visit each node (probably after names resolution) > and save its type depending on child’s types. > >> If you are not sure about the price, we could ask Alexander to run >> benches on your branch before pushing into the master. > > This patch is auxiliary to main-patchset. I’ve added it to avoid > abusing changing result files like sql/types.result etc. Can drop it tho. Now I see, that there is nothing to bench. Almost pure refactoring. >>> So, to improve accuracy >>> of type determination, let's always call that method and remove type >>> assignment in sqlExprCodeTarget(). >> >> This patch fixes the issue >> https://github.com/tarantool/tarantool/issues/4126. > > Ok, added label. > During the second review I found some orphan type assignments in codetarget(). I dropped them and the tests passed. The diff is on the branch. ================================================================== diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c index 016d6f899..0ad4b74f8 100644 --- a/src/box/sql/expr.c +++ b/src/box/sql/expr.c @@ -3725,7 +3725,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) case TK_AGG_COLUMN:{ AggInfo *pAggInfo = pExpr->pAggInfo; struct AggInfo_col *pCol = &pAggInfo->aCol[pExpr->iAgg]; - pExpr->type = pCol->pExpr->type; if (!pAggInfo->directMode) { assert(pCol->iMem > 0); return pCol->iMem; @@ -3757,7 +3756,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) iTab = pParse->iSelfTab; } } - pExpr->type = pExpr->space_def->fields[col].type; return sqlExprCodeGetColumn(pParse, pExpr->space_def, col, iTab, target, @@ -3925,7 +3923,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) } case TK_UMINUS:{ Expr *pLeft = pExpr->pLeft; - pExpr->type = FIELD_TYPE_NUMBER; assert(pLeft); if (pLeft->op == TK_INTEGER) { expr_code_int(pParse, pLeft, true, target); @@ -3989,7 +3986,6 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target) tt_sprintf(err, pExpr->u.zToken)); pParse->is_aborted = true; } else { - pExpr->type = pInfo->aFunc->pFunc->ret_type; return pInfo->aFunc[pExpr->iAgg].iMem; } break;