From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id EE1D52B602 for ; Thu, 18 Apr 2019 13:54:50 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id wbtoLcdWIhNE for ; Thu, 18 Apr 2019 13:54:50 -0400 (EDT) Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id A49782A798 for ; Thu, 18 Apr 2019 13:54:50 -0400 (EDT) From: "n.pettik" Message-Id: <4D657191-4BC6-43F0-97A5-64B784DB0DE8@tarantool.org> Content-Type: multipart/alternative; boundary="Apple-Mail=_F8859CBB-9584-4E22-A980-16F590345838" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.8\)) Subject: [tarantool-patches] Re: [PATCH 5/9] sql: improve type determination for column meta Date: Thu, 18 Apr 2019 20:54:48 +0300 In-Reply-To: <9edb1536-e63c-d9ab-5130-1f38b822d698@tarantool.org> References: <9edb1536-e63c-d9ab-5130-1f38b822d698@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org Cc: Vladislav Shpilevoy --Apple-Mail=_F8859CBB-9584-4E22-A980-16F590345838 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On 16 Apr 2019, at 17:12, Vladislav Shpilevoy = wrote: >=20 > On 14/04/2019 18:04, Nikita Pettik wrote: >> In most cases we don't assign and store type of node of expression = AST >> (except for constant literals). To determine type of node we use >> sql_expr_type() function, which implements logic of quite simple >> recursive tree traversal. Before this patch we set type of node after >> code generation in sqlExprCodeTarget() without any traversal. This >> approach is way worse even then sql_expr_type(). >=20 > Why? How is recursive scan of the tree better than simple access to > a struct's attribute? I see, that your patch improves type = determination > precision, but I do not understand how, and what a perf price for = that? Firstly, it seems that sqlExprCodeTarget() can be called on copy of original expr (when query contains order/group by clauses): I see that types assigned in that function are different from types in expr passed to generateColumnNames(). Secondly, suggested approach simply makes code cleaner and easier to maintain: we shouldn=E2=80=99t care about adding on each branch of = switch appropriate type (since it is already done in sql_expr_type()). Eventually, sqlExprCodeTarget() doesn=E2=80=99t take into consideration types of child nodes. This drawback will appear when we extend integer type with small/big ints: Column a is of type smallint. SELECT 1 + a; Should result in =E2=80=98smallint' type, not common =E2=80=98number=E2=80= =99. Without tree traversal we can=E2=80=99t do that. Price is a primitive traversal through the expr tree. This traversal = (but in more sophisticated implementation and being called only once saving in each = node its type) is anyway required to implement this task: https://github.com/tarantool/tarantool/issues/3103 = In fact, we should once visit each node (probably after names = resolution) and save its type depending on child=E2=80=99s types. > If you are not sure about the price, we could ask Alexander to run > benches on your branch before pushing into the master. This patch is auxiliary to main-patchset. I=E2=80=99ve added it to avoid abusing changing result files like sql/types.result etc. Can drop it = tho. >> So, to improve accuracy >> of type determination, let's always call that method and remove type >> assignment in sqlExprCodeTarget(). >=20 > This patch fixes the issue > https://github.com/tarantool/tarantool/issues/4126. Ok, added label. --Apple-Mail=_F8859CBB-9584-4E22-A980-16F590345838 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On 16 Apr 2019, at 17:12, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:

On 14/04/2019 18:04, Nikita = Pettik wrote:
In most = cases we don't assign and store type of node of expression AST
(except for constant literals).  To determine type of = node  we use
sql_expr_type() function, which = implements logic of quite simple
recursive tree traversal. = Before this patch we set type of node after
code = generation in sqlExprCodeTarget() without any traversal. This
approach is way worse even then sql_expr_type().

Why? How is recursive scan of the = tree better than simple access to
a struct's attribute? I = see, that your patch improves type determination
precision, = but I do not understand how, and what a perf price for that?

Firstly, it seems that sqlExprCodeTarget() can be = called on copy of
original expr (when query contains = order/group by clauses): I see
that types assigned in that = function are different from types in expr
passed to = generateColumnNames().

Secondly, = suggested approach simply makes code cleaner and easier
to = maintain: we shouldn=E2=80=99t care about adding on each branch of = switch
appropriate type (since it is already done in = sql_expr_type()).

Eventually, = sqlExprCodeTarget() doesn=E2=80=99t take into = consideration
types of child nodes. This drawback will appear = when we extend
integer type with small/big ints:

Column a is of type smallint.

SELECT 1 + a;

Should result in =E2=80=98smallint' type, not = common =E2=80=98number=E2=80=99.
Without tree traversal we = can=E2=80=99t do that.

Price = is a primitive traversal through the expr tree. This traversal (but in = more
sophisticated implementation and being called only once = saving in each node
its type) is anyway required to implement = this task:

In fact, we should once visit each node = (probably after names resolution)
and save its type depending = on child=E2=80=99s types.

If you are not sure about the = price, we could ask Alexander to run
benches on your = branch before pushing into the master.

This patch is auxiliary to main-patchset. I=E2=80=99= ve added it to avoid
abusing changing result files like = sql/types.result etc. Can drop it tho.

So, to improve accuracy
of type = determination, let's always call that method and remove type
assignment in sqlExprCodeTarget().

This patch fixes the issue
https://github.com/tarantool/tarantool/issues/4126.

Ok, added = label.

= --Apple-Mail=_F8859CBB-9584-4E22-A980-16F590345838--