Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Nikita Pettik <korablev@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 15/16] box: introduce prepared statements
Date: Sat, 7 Dec 2019 00:18:05 +0100	[thread overview]
Message-ID: <dd020d5c-0934-9ba3-d7b5-4efc02d35c3f@tarantool.org> (raw)
In-Reply-To: <534c76888e033e2f37a93ee0f20a1b5ef806498d.1574277369.git.korablev@tarantool.org>

Thanks for the patch!

See 7 comments below.

On 20/11/2019 22:28, Nikita Pettik wrote:
> This patch introduces local prepared statements. Support of prepared
> statements in IProto protocol and netbox is added in the next patch.
> 
> Prepared statement is an opaque instance of SQL Virtual Machine. It can
> be executed several times without necessity of query recompilation. To
> achieve this one can use box.prepare(...) function. It takes string of
> SQL query to be prepared; returns extended set of meta-information
> including statement's string, parameter's types and names, types and
> names of columns of the resulting set, count of parameters to be bound.
> Lua object representing result of :prepare() invocation also features method
> :execute(). It corresponds to box.execute(stmt.sql_str), i.e. automatically
> substitutes string of prepared statement to be executed. Statements are
> held in prepared statement cache - for details see previous commit.
> After schema changes all prepared statement located in cache are
> considered to be expired - they are re-prepared automatically on demand.
> It is worth noting that box.execute() always attempts at finding
> statement to be executed in prepared statement cache. Thus, once statement
> is prepared, both box.execute() and :execute() methods will execute
> already compiled statement.
> 
> SQL cache memory limit is regulated by box{sql_cache_size} which can be
> set dynamically. Setting it to 0 completely erases cache.

1. The problem with cache size is that you gave users a way to
limit it, but didn't give a method to calculate how much memory
do they need for their statements. For example, a user knows
what statements he wants in the cache. How can he estimate the
cache size for the these statements? Does he need 100 bytes per
statement, 1KB, 100KB?

We either need a theoretical description how to calculate it,
even very rough, or maybe return a statement estimated size
on box.prepare(). Then a user at least can do some experiments
to estimate cache size.

Or you can add statistics to box.info about how many SQL
statements are prepared, and how much memory is occupied in
the cache. That is even better I think. In box.info.sql.

Maybe that is a good thing to do in a separate ticket. Don't
know.

> diff --git a/src/box/execute.c b/src/box/execute.c
> index d2a999099..2b5a9ba90 100644
> --- a/src/box/execute.c
> +++ b/src/box/execute.c> @@ -411,6 +412,59 @@ port_sql_dump_msgpack(struct port *port, struct obuf *out)
>  	return 0;
>  }
>  
> +static bool
> +sql_stmt_check_schema_version(struct sql_stmt *stmt)

2. 'Const' struct sql_stmt, please.

> +{
> +	return sql_stmt_schema_version(stmt) == box_schema_version();
> +}
> +
> +static int
> +sql_reprepare(struct stmt_cache_node **node)
> +{
> +	struct sql_stmt *stmt = (*node)->stmt;
> +	const char *sql_str = sql_stmt_query_str(stmt);
> +	struct sql_stmt *fresh_stmt;
> +	if (sql_compile(sql_str, strlen(sql_str), NULL,
> +			&fresh_stmt, NULL) != 0)
> +		return -1;
> +	sql_prepared_stmt_cache_delete(*node);
> +	if (sql_prepared_stmt_cache_insert(fresh_stmt) != 0) {
> +		sql_finalize(fresh_stmt);
> +		return -1;
> +	}
> +	sql_str = sql_stmt_query_str(fresh_stmt);
> +	*node = sql_prepared_stmt_cache_find(sql_str);
> +	return 0;
> +}
> +
> +int
> +sql_prepare(const char *sql, int len, struct port *port)
> +{
> +	struct stmt_cache_node *stmt_node = sql_prepared_stmt_cache_find(sql);
> +	struct sql_stmt *stmt;
> +	if (stmt_node == NULL) {
> +		if (sql_compile(sql, len, NULL, &stmt, NULL) != 0)
> +			return -1;
> +		if (sql_prepared_stmt_cache_insert(stmt) != 0) {
> +			sql_finalize(stmt);
> +			return -1;
> +		}
> +	} else {
> +		if (! sql_stmt_check_schema_version(stmt_node->stmt)) {
> +			if (sql_reprepare(&stmt_node) != 0)
> +				return -1;
> +		} else {
> +			sql_cache_stmt_refresh(stmt_node);
> +		}
> +		stmt = stmt_node->stmt;
> +	}
> +	enum sql_dump_format dump_format = sql_column_count(stmt) > 0 ?
> +					   DQL_PREPARE : DML_PREPARE;
> +	port_sql_create(port, stmt, dump_format, PREPARE);
> +

3. So, there is no any reference counting as I see?
Do you plan to add it? Because as you fairly mentioned,
in that implementation a statement, prepared in 100
different sessions, gets evicted the same as a statement
prepared only once.

However with reference counting there would be another
problem: how to increment it, actually? Each time
prepare() is called explicitly? And do nothing when only
execute() is called? And what to do when something needs
to be evicted?

> +	return 0;
> +}
> +
>  /**
>   * Execute prepared SQL statement.
>   *
> @@ -448,11 +502,47 @@ sql_execute(struct sql_stmt *stmt, struct port *port, struct region *region)
>  int
>  sql_prepare_and_execute(const char *sql, int len, const struct sql_bind *bind,
>  			uint32_t bind_count, struct port *port,
>  			struct region *region)
>  {
> +	struct stmt_cache_node *stmt_node = sql_prepared_stmt_cache_find(sql);
> +	if (stmt_node != NULL) {

4. In case the node is NULL, and it was prepared sometime before
that, and was not unprepared, then a user does not have a way how
to understand, that a new PREPARE is needed. That makes impossible
to rely on any long living 'prepare' handle on a client side.

A client either needs to call prepare() before doing multiple
execute() just in case, or we need to return something to signal
whether the execute() has hit the cache. For example, a flag
IPROTO_SQL_CACHE_HIT: boolean. Or something like that.

Or send a flag which you lead to autoreprepare. I don't know. But
certainly it should be left as is. Probably that is a subject for
a separate ticket.

> +		if (! sql_stmt_check_schema_version(stmt_node->stmt)) {
> +			if (sql_reprepare(&stmt_node) != 0)
> +				return -1;
> +		}
> +		if (! sql_stmt_busy(stmt_node->stmt)) {
> +			return sql_execute_prepared(stmt_node->stmt, bind,
> +						    bind_count, port, region);
> +		}
> +	}
> diff --git a/src/box/lua/execute.c b/src/box/lua/execute.c
> index 1b2f8d235..8d46399d4 100644
> --- a/src/box/lua/execute.c
> +++ b/src/box/lua/execute.c
> @@ -82,6 +146,55 @@ port_sql_dump_lua(struct port *port, struct lua_State *L, bool is_flat)
>  		}
>  		break;
>  	}
> +	case DQL_PREPARE: {
> +		/* Format is following:

5. Lets better stick to the one approved code
style: '/*' is on a separate line always. Here
and in the next 'case:'.

And now I see, why did you introduce dump format
enum. But I don't know how to comment it yet.
I think, on a next iteration after the other
fixes I will say more.

> diff --git a/test/sql/prepared.test.lua b/test/sql/prepared.test.lua
> new file mode 100644
> index 000000000..24e49832f
> --- /dev/null
> +++ b/test/sql/prepared.test.lua
> @@ -0,0 +1,196 @@
> +s = prepare("CREATE TRIGGER tr1 INSERT ON test1 FOR EACH ROW BEGIN DELETE FROM test1; END;")
> +execute(s.sql_str)
> +execute(s.sql_str)
> +
> +s = prepare("DROP TRIGGER tr1;")
> +execute(s.sql_str)
> +execute(s.sql_str)
> +
> +s = prepare("DROP TABLE test1;")
> +execute(s.sql_str)
> +execute(s.sql_str)

6. I saw Kostja asked about why do you increment
schema version on some new system space updates,
such as _trigger.

There is why - we generate DROP TABLE as manual
deletion of all space-related objects, and then of
the space itself. Space related objects include
triggers, ck, fk. And they are taken from struct
space during compilation. That is, if you won't
increment schema version on trigger/ck/fk update,
a prepared DROP TABLE won't take the changes into
account and won't be regenerated.

This test fails in case you don't update schema
version on _trigger update, right?

Talking of all the 'generate DROP TABLE by struct
space content' idea - I don't like it. But this
is how it works now. I future we will need to
open an iterator on the needed spaces at runtime.
Because lookup in struct space won't be available
on client side anyway.

> +f1 = fiber.new(implicit_yield)
> +f2 = fiber.new(implicit_yield)
> +f1:set_joinable(true)
> +f2:set_joinable(true)
> +
> +f1:join();
> +f2:join();
> +
> +test_run:cmd("setopt delimiter ''");
> +
> +box.space.TEST:drop()
> +box.schema.func.drop('SLEEP')
> 

7. On the whole, that is probably the best thing happened to SQL
in Tarantool for a very long time.

  parent reply	other threads:[~2019-12-06 23:18 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20 21:27 [Tarantool-patches] [PATCH v2 00/16] sql: " Nikita Pettik
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 01/16] sql: remove sql_prepare_v2() Nikita Pettik
2019-12-04 11:36   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 02/16] sql: refactor sql_prepare() and sqlPrepare() Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-04 11:36   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 03/16] sql: move sql_prepare() declaration to box/execute.h Nikita Pettik
2019-12-04 11:37   ` Konstantin Osipov
2019-12-05 13:32     ` Nikita Pettik
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 04/16] sql: rename sqlPrepare() to sql_compile() Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-13 13:49     ` Nikita Pettik
2019-12-04 11:39   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 05/16] sql: move sql_finalize() to execute.h Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-04 11:39     ` Konstantin Osipov
2019-12-13 13:49     ` Nikita Pettik
2019-12-04 11:40   ` Konstantin Osipov
2019-12-05 13:37     ` Nikita Pettik
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 06/16] port: increase padding of struct port Nikita Pettik
2019-12-04 11:42   ` Konstantin Osipov
2019-12-13 13:54     ` Nikita Pettik
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 07/16] port: add dump format and request type to port_sql Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-13 13:55     ` Nikita Pettik
2019-12-04 11:52   ` Konstantin Osipov
2019-12-13 13:53     ` Nikita Pettik
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 08/16] sql: resurrect sql_bind_parameter_count() function Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-04 11:54   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 09/16] sql: resurrect sql_bind_parameter_name() Nikita Pettik
2019-12-04 11:55   ` Konstantin Osipov
2019-12-04 11:55     ` Konstantin Osipov
2019-12-13 13:55     ` Nikita Pettik
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 10/16] sql: add sql_stmt_schema_version() Nikita Pettik
2019-12-04 11:57   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 11/16] sql: introduce sql_stmt_sizeof() function Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-13 13:56     ` Nikita Pettik
2019-12-04 11:59   ` Konstantin Osipov
2019-12-13 13:56     ` Nikita Pettik
2019-12-13 14:15       ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 12/16] box: increment schema_version on ddl operations Nikita Pettik
2019-12-04 12:03   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 13/16] sql: introduce sql_stmt_query_str() method Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-04 12:04   ` Konstantin Osipov
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 14/16] sql: introduce cache for prepared statemets Nikita Pettik
2019-12-03 22:51   ` Vladislav Shpilevoy
2019-12-04 12:11   ` Konstantin Osipov
2019-12-17 14:43   ` Kirill Yukhin
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 15/16] box: introduce prepared statements Nikita Pettik
2019-12-04 12:13   ` Konstantin Osipov
2019-12-06 23:18   ` Vladislav Shpilevoy [this message]
2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 16/16] netbox: " Nikita Pettik
2019-12-06 23:18   ` Vladislav Shpilevoy
2019-12-03 22:51 ` [Tarantool-patches] [PATCH v2 00/16] sql: " Vladislav Shpilevoy
2019-12-17 15:58 ` Georgy Kirichenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dd020d5c-0934-9ba3-d7b5-4efc02d35c3f@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=korablev@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 15/16] box: introduce prepared statements' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox