[tarantool-patches] Re: [PATCH 0/8] rfc: introduce dry-run execution of SQL queries

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Aug 29 23:46:19 MSK 2019


Hi! Thanks for the patchset!

On 27/08/2019 15:34, Nikita Pettik wrote:
> Branch: https://github.com/tarantool/tarantool/tree/np/sql-dry-run-rfc
> Issue: https://github.com/tarantool/tarantool/issues/3292
> 
> In a nutshell dry-run is a way to get auxiliary (meta-) information of query
> execution without direct execution itself. For instance, every DQL query
> returns column's name and type alongside with tuples forming the resulting set.
> Meta-information can be extended with column's nullability, collation and other
> properties. This feature is required to develop SQL drivers. 
> 
> A bit of details.
> 
> There are three forms of dumped msgpack structure (and corresponding IPROTO
> response format). First one relates to DQL (i.e. SELECT) queries:
> METADATA : [{name, type}, {name, type}, ...], DATA : [tuple, tuple ...]
> 
> In case of dry-run, only metadata should be dumped. As a consequence, DATA
> section becomes optional and depends on port settings (see new 'meta_only'
> option in struct port_sql). To pass flag indicating dry-run execution,
> :execute() method has been allowed to accept 'options' argument. Options can
> form a map of named parameters ({'dry_run' = true}) or simply a list of
> unnamed ordered values. Options are encoded with IPROTO_OPTIONS key. The only
> available option now is 'dry_run'.

I see that you made the API different for netbox and box - it should not be
so. We either do options for both netbox and box, or a separate method for both.

What to choose - I don't know. Kostja contradicts himself in his email, where he
said that you can use 'prepare' and return a handle, but on the other hand he
proposed to auto-cache all prepared statements. I don't think that auto-cache is
a good idea. It is hard to support, hard to prevent eviction when necessary, etc.
Kirll Y. also sticks to explicit prepare as I know.

On a chosen option it depends whether we need to use 'prepare' and return a handle
with metadata, or we need to add an option, and return the meta without a handle.

> To execute query in dry-run mode locally separate method of box has been added:
> box.dry_run("sql_string"). Current implementation does not modify box.execute()
> making it accept third optional argument (except for array of bindings) - this
> is to be discussed. 
> 
> Other open (or dubious) questions.
> 
> 1. How to handle DML queries. Meanwhile getting meta-information of DDL queries
> is likely to be useless, calculating number of affected rows in case of DML
> query can be sensible. On the other hand, it can turn out to be quite
> non-trivial task.  In its current state, box.dry_run("DML_Query") always
> returns 0 as row count. IMHO it can be done but as a follow-up

First about whether it makes sense to calculate any results like row_count at a
dry-run - of course it does not. Assume that you do it and appeared that row_count
would be 1. User relies on that and sends a new real request. Before the request
is executed, another one is scheduled and makes such changes, that the prepared one
now returns row_count 0, or even an error.

I think constant row_count 0 is better.

But DML will be able to return more meta in future via RETURNING as I know. Also
DML already now has other metadata - autoincremented field values. Perhaps
will have more. It means, that not only 'row_count' is returned. What to do with
them?

Perhaps it is worth not to return anything except meta for DQL. For DML and DDL
return an error, or empty meta. Assuming that DML in future will be able to
return meta, perhaps an empty one is better for now.

> 2. Should dry_run() except for parsing also bind variables? I guess no, since
> it results in byte-code (i.e. prepared statement) modification. On the other
> hand, it allows setting correct type for bindings (see sql_bind_type()). By
> default, all binding parameters have boolean type.

Yes, it changes bytecode. But it is harmless anyway. What other DBs do?

> 
> 3. Interface for local call. In current implementation I've added box.dry_run()
> in addition to traditional box.execute() method. Instead, we can make
> box.execute() accept third argument as it is handled in net.box.

The only thing I am sure about is that box and netbox APIs should be the same.

> Nikita Pettik (8):
>   port: increase padding of struct port
>   port: move struct port_sql to box/port.h
>   sql: remove sql_prepare_v2()
>   sql: refactor sql_prepare() and sqlPrepare()
>   sql: move sql_prepare() declaration to box/execute.h
>   refactoring: use sql_prepare() and sql_execute() in tx_process_sql()
>   netbox: allow passing options to :execute()
>   sql: introduce dry-run execution
> 
>  src/box/errcode.h        |   1 +
>  src/box/execute.c        | 104 ++++++++++++++++++++++++++++++++++++-----------
>  src/box/execute.h        |  61 ++++++++++++++++++++-------
>  src/box/iproto.cc        |  22 +++++++++-
>  src/box/lua/execute.c    |  37 ++++++++++++++++-
>  src/box/lua/net_box.c    |   7 ++--
>  src/box/lua/net_box.lua  |   3 --
>  src/box/port.h           |  26 ++++++++++++
>  src/box/sql/analyze.c    |  16 ++++----
>  src/box/sql/legacy.c     |   3 +-
>  src/box/sql/prepare.c    |  38 +++--------------
>  src/box/sql/sqlInt.h     |  16 --------
>  src/box/sql/vdbe.h       |   2 +-
>  src/box/sql/vdbeInt.h    |   1 -
>  src/box/sql/vdbeapi.c    |   9 ++--
>  src/box/sql/vdbeaux.c    |   5 +--
>  src/box/xrow.c           |  19 +++++++--
>  src/box/xrow.h           |   5 +++
>  src/lib/core/port.h      |   3 +-
>  test/box/misc.result     |   2 +
>  test/sql/iproto.result   |  93 +++++++++++++++++++++++++++++++++++++++++-
>  test/sql/iproto.test.lua |  19 +++++++++
>  22 files changed, 371 insertions(+), 121 deletions(-)
> 




More information about the Tarantool-patches mailing list