[Tarantool-discussions] RFC - distributed SQL, step #1 - AST

Timur Safin tsafin at tarantool.org
Thu Nov 26 10:45:12 MSK 2020



: From: Nikita Pettik <korablev at tarantool.org>
: Subject: Re: [Tarantool-discussions] RFC - distributed SQL, step #1 - AST
: 
: On 25 Nov 01:06, Timur Safin wrote:
: >
: > Exporting AST as cdata is easy to do, but in this case it will be
: > inconvenient to manipulate AST nodes, before sending to cluster nodes.
: > So there is 2nd, more idiomatic approach suggested - to convert AST
: > to nested Lua object/tables.
: 
: Hm, why do you need those tables? Serialization into msgpack can
: be done inside SQL internals.

I do understand that converting the AST directly to msgpack and building Lua
tables for manipulation have the same complexity, and would mostly be done by
the same walker code. The problem I foresee with msgpack-only output is that it
would be hard to massage the AST in Lua if we ever need to. And we will have to
do it eventually, once we approach the distributed SQL router task, which might
need to modify the original AST before sending it to data nodes for their local
execution. Do you remember how Mike had to process AST cdata for special
aggregate function handling and for modification of the column list?

That's why I assumed that exposing AST nodes as Lua tables would simplify such
modifications; I guessed it would be more idiomatic for Lua. But I might
be wrong here...
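
To illustrate the kind of router-side massaging meant here, a minimal sketch,
assuming the AST is exposed as nested Lua tables. The table layout (`kind`,
`func`, `args`, `columns`) and the `split_avg` helper are hypothetical and not
defined by the RFC; the point is only that such a rewrite becomes ordinary
table manipulation.

```lua
-- Hypothetical shape of a SELECT AST as plain Lua tables.
-- None of these field names come from the RFC; they are assumptions.
local tree = {
    type = 'select',
    select = {
        columns = {
            { kind = 'column', name = 'region' },
            { kind = 'call', func = 'avg',
              args = { { kind = 'column', name = 'price' } } },
        },
        from = 'orders',
    },
}

-- Router-side rewrite: per-node avg(x) results cannot be merged directly,
-- so ask every data node for sum(x) and count(x) instead.
local function split_avg(columns)
    local out = {}
    for _, col in ipairs(columns) do
        if col.kind == 'call' and col.func == 'avg' then
            out[#out + 1] = { kind = 'call', func = 'sum', args = col.args }
            out[#out + 1] = { kind = 'call', func = 'count', args = col.args }
        else
            out[#out + 1] = col
        end
    end
    return out
end

tree.select.columns = split_avg(tree.select.columns)
```

Doing the same against cdata would require allocating and linking C
structures by hand, which is the inconvenience the table form tries to avoid.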

: 
: > Which should simplify some massaging
: > and provide natural way to serialization to msgpack.
: >
: >
: > SYNOPSIS
: > ~~~~~~~~
: >
: > .. code-block:: lua
: >
: >    local sql = require 'sqlparser'
: >
: >    -- parse, return ast, pass it back unmodified for execution
: >
: >    local ast = sql.parse [[ select * from "table" where id > 10 limit 10 ]]
: 
: Should this be public API? Alternatively, we can hide it in sql.internals.

I consider sqlparser an internal API already (it would eventually be box.sql;
sqlparser is just the temporary name used for now). So having yet another
internal API would not make much sense. But it's all up to our decision.

: 
: >    assert(type(ast) == 'cdata')
: >    local ok = sql.execute(ast)
: >
: >    -- free allocated memory, like box.unprepare but for ast
: >    sql.unparse(ast)
: 
: I don't like unparse name. In fact even unprepare is a bad name,
: it should be called sort of deallocate. I suggest sql.release_ast()/
: sql.free_ast() naming.

I'm ok with unprepare :) [we used this name at InterSystems in similar
contexts] and thus ok with unparse. But similarly I'm ok with any different
name. The major point here is that there should be something which frees the
AST data structures kept elsewhere.
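
Whatever the name ends up being, the release call can be made hard to forget
by wrapping it. A minimal sketch, assuming the `sql.release_ast()` name
suggested above; the `sql` table below is a stub standing in for the proposed
module, and `with_ast` is a hypothetical helper, not part of the RFC.

```lua
-- Stub standing in for the proposed module (assumption, for illustration).
local released = {}
local sql = {
    parse = function(text) return { text = text } end,
    release_ast = function(ast) released[#released + 1] = ast.text end,
}

-- Parse, run fn(ast), and always release the AST, even if fn raises.
local function with_ast(text, fn)
    local ast = sql.parse(text)
    local ok, result = pcall(fn, ast)
    sql.release_ast(ast)
    if not ok then
        error(result, 0) -- re-raise the original error unchanged
    end
    return result
end

local n = with_ast('select 1', function(ast) return #ast.text end)
```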

: 
: >    -- raw access to cdata structures
: >    local cdata = ast.as_cdata()
: >    if cdata.ast_type == ffi.C.AST_TYPE_SELECT then
: >       handle_select_stmt(cdata.select)
: >    end
: >
: >    -- Lua access to structures as Lua tables
: >    local tree = ast.as_table()
: >    if tree.type == AST_TYPE_SELECT then
: >       handle_columns_list(tree.select.columns)
: >       handle_where_clause(tree.select.where)
: >       limit = tree.select.limit
: 
: What's the purpose of these handles?

Sorry for the confusion. Those **handle_anything** calls were meant to
represent any user-defined functions which manipulate the exported AST
data, i.e. I should have put it this way:

  user_handle_column_list(tree.select.columns)
  user_handle_where_clause(tree.select.where)

and here we only define the tree subobjects that are exposed, and don't
care how and where users define their functions for traversing/manipulating
the AST.
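
For instance, such a user-defined function could be an ordinary recursive
walker. A minimal sketch, assuming a hypothetical node layout (`kind`, `op`,
`left`, `right`, `name`, `value`) that the RFC does not define:

```lua
-- Hypothetical WHERE subtree for: id > 10 (field names are assumptions).
local where = {
    kind = 'binary', op = '>',
    left = { kind = 'column', name = 'id' },
    right = { kind = 'literal', value = 10 },
}

-- User-defined handler: collect the names of all columns referenced
-- anywhere in the WHERE clause.
local function user_handle_where_clause(node, found)
    found = found or {}
    if node.kind == 'column' then
        found[#found + 1] = node.name
    elseif node.kind == 'binary' then
        user_handle_where_clause(node.left, found)
        user_handle_where_clause(node.right, found)
    end
    return found
end

local referenced = user_handle_where_clause(where)
```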


: 
: >    end
: >    -- massaging with tree data
: >
: >    -- serialization
: >    local msgpack = require 'msgpack'
: >    local to_wire = msgpack.encode(tree)
: >
: >    -- networking magics ...
: >    -- ... deserialization
: >    local decoded = msgpack.decode(from_wire)
: >    ast.set_tree(decoded)
: >    sql.execute(ast)
: >
: >
: > Regards,
: > Timur
: >
: > P.S.
: >


