[tarantool-patches] Re: [PATCH 2/2] sql: compute resulting collation for concatenation

n.pettik korablev at tarantool.org
Thu Jan 17 22:19:26 MSK 2019

> On 17 Jan 2019, at 16:33, Konstantin Osipov <kostja at tarantool.org> wrote:
> * Nikita Pettik <korablev at tarantool.org> [19/01/16 17:06]:
>> According to ANSI, result of concatenation operation should derive
>> collation sequence from its operands. Now this is not true: the result
>> always comes with no ("none") collation.
> Generally, it should be very cheap to introduce expression static
> analysis phase by adding static analysis state to struct Expr.
> Yes, it's a blasphemy from separation of concerns point of view
> but it seems to be a lesser evil than invoking partial static
> analysis here and there during code generation.
> What I mean is that instead of changing signature of
> sql_expr_coll() one should be able to do:
> /**
>  * Fills expr->coll for every node in the expression tree or
>  * returns an appropriate error if there is a type error.

*Type and collation analysis are slightly different things I guess.*

>  */
> int
> sql_expr_static_analysis(struct Expr *expr);
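
To make sure we are talking about the same thing, here is a minimal
sketch of how I read the proposal. Everything in it is hypothetical:
struct expr, its coll_id field, COLL_NONE and the concatenation rule
(take the operand's collation if the other operand has none, fail if
the two differ) are simplifications for illustration. They are neither
the real struct Expr nor the ANSI derivation rules nor what the patch
does. The sketch is recursive only to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical id meaning "no collation". */
enum { COLL_NONE = 0 };
enum expr_op { OP_LEAF, OP_CONCAT };

/* Hypothetical stand-in for struct Expr with analysis state embedded. */
struct expr {
	enum expr_op op;
	unsigned coll_id;	/* filled in by the analysis for inner nodes */
	struct expr *left;
	struct expr *right;
};

/*
 * Fill coll_id for every inner node, bottom-up.
 * Toy rule for OP_CONCAT (an assumption, see above): if one operand
 * has no collation, use the other one's; if both have the same
 * collation, use it; otherwise report a conflict.
 * Returns 0 on success, -1 on conflict.
 */
int
expr_static_analysis(struct expr *e)
{
	if (e == NULL || e->op == OP_LEAF)
		return 0;
	/* A concatenation node must have both operands. */
	assert(e->left != NULL && e->right != NULL);
	if (expr_static_analysis(e->left) != 0 ||
	    expr_static_analysis(e->right) != 0)
		return -1;
	unsigned l = e->left->coll_id;
	unsigned r = e->right->coll_id;
	if (l == COLL_NONE)
		e->coll_id = r;
	else if (r == COLL_NONE || l == r)
		e->coll_id = l;
	else
		return -1;
	return 0;
}
```

With such a pass in place, code generation would only read the cached
coll_id instead of computing it on demand.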

Implementing a decent static analysis pass may not be
as easy as it seems. First, I guess we should
deal with https://github.com/tarantool/tarantool/issues/3861
In a nutshell, the walker currently uses a recursive approach.
Hence, due to the very limited size of a fiber’s stack, we get
a stack overflow on expressions that are not even that large.
One more recursive routine may make things even worse.
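
One possible way around the recursion, sketched below, is to keep the
pending nodes on an explicit heap-allocated stack, so that the depth of
the expression no longer consumes the fiber's stack. This is only an
illustration under my own assumptions: struct node, walk_iterative()
and count_visit() are hypothetical names, not the existing walker API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified expression node; not struct Expr. */
struct node {
	struct node *left;
	struct node *right;
};

/* Example callback: counts visited nodes in the int pointed to by ctx. */
void
count_visit(struct node *n, void *ctx)
{
	(void)n;
	++*(int *)ctx;
}

/*
 * Pre-order traversal without recursion. Pending nodes live on a
 * heap-allocated stack that is doubled on demand.
 * Returns 0 on success, -1 on allocation failure.
 */
int
walk_iterative(struct node *root, void (*visit)(struct node *, void *),
	       void *ctx)
{
	assert(visit != NULL);
	size_t cap = 16, size = 0;
	struct node **stack = malloc(cap * sizeof(*stack));
	if (stack == NULL)
		return -1;
	if (root != NULL)
		stack[size++] = root;
	while (size > 0) {
		struct node *n = stack[--size];
		visit(n, ctx);
		/* Make room for up to two children. */
		if (size + 2 > cap) {
			size_t new_cap = cap * 2;
			struct node **tmp =
				realloc(stack, new_cap * sizeof(*stack));
			if (tmp == NULL) {
				free(stack);
				return -1;
			}
			stack = tmp;
			cap = new_cap;
		}
		/* Push right first so that left is visited first. */
		if (n->right != NULL)
			stack[size++] = n->right;
		if (n->left != NULL)
			stack[size++] = n->left;
	}
	free(stack);
	return 0;
}
```

The price is an allocation per walk and a callback interface that
cannot rely on the C call stack to carry per-node state.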

Second, we should decide where to place this analysis.
There is no single entry point before code generation,
nor is there a strict separation between parsing
and code generation. Moreover, a complete AST may not be
constructed at all. I’m simply inlining part of R. Hipp’s recent
response from the SQLite mailing list:

> SQLite does not necessarily generate a complete AST, then hand
> that AST off to a code generator for evaluation, all in one neat step.
> Depending on the SQL statement, the byte code may be generated using
> techniques reminiscent of "syntax directed translation".  Control hops
> back and forth between parsing and code generation.  Sections of
> bytecode might generated at multiple reduce actions within the parse.

If you give me your blessing and allow me to spend extra time, I can
try to investigate this issue.

The current approach at least works now.
