[Tarantool-discussions] Reusable SQL parser

Alexander Turenko alexander.turenko at tarantool.org
Sun Nov 24 07:07:10 MSK 2019


Discussions about ways to reuse Tarantool SQL parser in applications and
connectors arise from time to time. I want to share a very high-level proposal
how we can reuse the parser in an application and provide some examples we
heard from users.

Proposal
--------

Proposed to extract or expose the SQL parser from Tarantool to provide ability
to generate AST accessible from Lua in an application and make transformations
on the AST.

The AST can be evaluated lazily: when a node is accessed from a Lua land, then
a Lua object (cdata or userdata) is created.

Proposed to provide an ability to generate an SQL request back from this AST
rather then allow to execute an AST. This way looks more debuggable, while
performance considerations may be eliminated using prepared statements.

NB: Implementation of a Lua API to retrieve AST looks as the good first step
to draw a line between the parser and the executor and so looks as a step to
extract the parser into its own library in the future (to use in connectors).

Let's consider a few examples.

Transparent versioning
----------------------

Example of transformation that a user may want to implement is transparently
select a last version of an object. Let's define a table and fill it with
versioned data.

 | CREATE TABLE objects(rowid INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER, \
 |                      version INTEGER, value VARCHAR(100))
 | INSERT INTO objects VALUES (null, 1, 1, 'record 1 version 1')
 | INSERT INTO objects VALUES (null, 1, 2, 'record 1 version 2')
 | INSERT INTO objects VALUES (null, 1, 3, 'record 1 version 3')
 | INSERT INTO objects VALUES (null, 2, 1, 'record 2 version 1')
 | INSERT INTO objects VALUES (null, 2, 2, 'record 2 version 2')
 | INSERT INTO objects VALUES (null, 3, 2, 'record 3 version 1')

Then, when a user of an app want to select data from this table:

 | SELECT id, value FROM objects

we transparently apply a transformation to retrieve only last version of each
object:

 | SELECT o1.id, o1.value FROM objects o1 LEFT JOIN objects o2 \
 | on (o1.id = o2.id and o1.version < o2.version) WHERE o2.version IS NULL

Based on this SO answer: https://stackoverflow.com/a/1313293/1598057

Transparent sharding
--------------------

We may transparently split an SQL request on a sharding router to several
requests: a part will be executed on the router itself, other parts on certain
storages.

I have no certain example now, so consider this as a kind of second-tier goal
rather then motivating example.

WBR, Alexander Turenko.


More information about the Tarantool-discussions mailing list