From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: Nikita Pettik <korablev@tarantool.org>, tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH v2 15/16] box: introduce prepared statements Date: Sat, 7 Dec 2019 00:18:05 +0100 [thread overview] Message-ID: <dd020d5c-0934-9ba3-d7b5-4efc02d35c3f@tarantool.org> (raw) In-Reply-To: <534c76888e033e2f37a93ee0f20a1b5ef806498d.1574277369.git.korablev@tarantool.org> Thanks for the patch! See 7 comments below. On 20/11/2019 22:28, Nikita Pettik wrote: > This patch introduces local prepared statements. Support of prepared > statements in IProto protocol and netbox is added in the next patch. > > Prepared statement is an opaque instance of SQL Virtual Machine. It can > be executed several times without necessity of query recompilation. To > achieve this one can use box.prepare(...) function. It takes string of > SQL query to be prepared; returns extended set of meta-information > including statement's string, parameter's types and names, types and > names of columns of the resulting set, count of parameters to be bound. > Lua object representing result of :prepare() invocation also features method > :execute(). It corresponds to box.execute(stmt.sql_str), i.e. automatically > substitutes string of prepared statement to be executed. Statements are > held in prepared statement cache - for details see previous commit. > After schema changes all prepared statement located in cache are > considered to be expired - they are re-prepared automatically on demand. > It is worth noting that box.execute() always attempts at finding > statement to be executed in prepared statement cache. Thus, once statement > is prepared, both box.execute() and :execute() methods will execute > already compiled statement. > > SQL cache memory limit is regulated by box{sql_cache_size} which can be > set dynamically. Setting it to 0 completely erases cache. 1. The problem with cache size is that you gave users a way to limit it, but didn't give a method to calculate how much memory do they need for their statements. For example, a user knows what statements he wants in the cache. How can he estimate the cache size for the these statements? Does he need 100 bytes per statement, 1KB, 100KB? We either need a theoretical description how to calculate it, even very rough, or maybe return a statement estimated size on box.prepare(). Then a user at least can do some experiments to estimate cache size. Or you can add statistics to box.info about how many SQL statements are prepared, and how much memory is occupied in the cache. That is even better I think. In box.info.sql. Maybe that is a good thing to do in a separate ticket. Don't know. > diff --git a/src/box/execute.c b/src/box/execute.c > index d2a999099..2b5a9ba90 100644 > --- a/src/box/execute.c > +++ b/src/box/execute.c> @@ -411,6 +412,59 @@ port_sql_dump_msgpack(struct port *port, struct obuf *out) > return 0; > } > > +static bool > +sql_stmt_check_schema_version(struct sql_stmt *stmt) 2. 'Const' struct sql_stmt, please. > +{ > + return sql_stmt_schema_version(stmt) == box_schema_version(); > +} > + > +static int > +sql_reprepare(struct stmt_cache_node **node) > +{ > + struct sql_stmt *stmt = (*node)->stmt; > + const char *sql_str = sql_stmt_query_str(stmt); > + struct sql_stmt *fresh_stmt; > + if (sql_compile(sql_str, strlen(sql_str), NULL, > + &fresh_stmt, NULL) != 0) > + return -1; > + sql_prepared_stmt_cache_delete(*node); > + if (sql_prepared_stmt_cache_insert(fresh_stmt) != 0) { > + sql_finalize(fresh_stmt); > + return -1; > + } > + sql_str = sql_stmt_query_str(fresh_stmt); > + *node = sql_prepared_stmt_cache_find(sql_str); > + return 0; > +} > + > +int > +sql_prepare(const char *sql, int len, struct port *port) > +{ > + struct stmt_cache_node *stmt_node = sql_prepared_stmt_cache_find(sql); > + struct sql_stmt *stmt; > + if (stmt_node == NULL) { > + if (sql_compile(sql, len, NULL, &stmt, NULL) != 0) > + return -1; > + if (sql_prepared_stmt_cache_insert(stmt) != 0) { > + sql_finalize(stmt); > + return -1; > + } > + } else { > + if (! sql_stmt_check_schema_version(stmt_node->stmt)) { > + if (sql_reprepare(&stmt_node) != 0) > + return -1; > + } else { > + sql_cache_stmt_refresh(stmt_node); > + } > + stmt = stmt_node->stmt; > + } > + enum sql_dump_format dump_format = sql_column_count(stmt) > 0 ? > + DQL_PREPARE : DML_PREPARE; > + port_sql_create(port, stmt, dump_format, PREPARE); > + 3. So, there is no any reference counting as I see? Do you plan to add it? Because as you fairly mentioned, in that implementation a statement, prepared in 100 different sessions, gets evicted the same as a statement prepared only once. However with reference counting there would be another problem: how to increment it, actually? Each time prepare() is called explicitly? And do nothing when only execute() is called? And what to do when something needs to be evicted? > + return 0; > +} > + > /** > * Execute prepared SQL statement. > * > @@ -448,11 +502,47 @@ sql_execute(struct sql_stmt *stmt, struct port *port, struct region *region) > int > sql_prepare_and_execute(const char *sql, int len, const struct sql_bind *bind, > uint32_t bind_count, struct port *port, > struct region *region) > { > + struct stmt_cache_node *stmt_node = sql_prepared_stmt_cache_find(sql); > + if (stmt_node != NULL) { 4. In case the node is NULL, and it was prepared sometime before that, and was not unprepared, then a user does not have a way how to understand, that a new PREPARE is needed. That makes impossible to rely on any long living 'prepare' handle on a client side. A client either needs to call prepare() before doing multiple execute() just in case, or we need to return something to signal whether the execute() has hit the cache. For example, a flag IPROTO_SQL_CACHE_HIT: boolean. Or something like that. Or send a flag which you lead to autoreprepare. I don't know. But certainly it should be left as is. Probably that is a subject for a separate ticket. > + if (! sql_stmt_check_schema_version(stmt_node->stmt)) { > + if (sql_reprepare(&stmt_node) != 0) > + return -1; > + } > + if (! sql_stmt_busy(stmt_node->stmt)) { > + return sql_execute_prepared(stmt_node->stmt, bind, > + bind_count, port, region); > + } > + } > diff --git a/src/box/lua/execute.c b/src/box/lua/execute.c > index 1b2f8d235..8d46399d4 100644 > --- a/src/box/lua/execute.c > +++ b/src/box/lua/execute.c > @@ -82,6 +146,55 @@ port_sql_dump_lua(struct port *port, struct lua_State *L, bool is_flat) > } > break; > } > + case DQL_PREPARE: { > + /* Format is following: 5. Lets better stick to the one approved code style: '/*' is on a separate line always. Here and in the next 'case:'. And now I see, why did you introduce dump format enum. But I don't know how to comment it yet. I think, on a next iteration after the other fixes I will say more. > diff --git a/test/sql/prepared.test.lua b/test/sql/prepared.test.lua > new file mode 100644 > index 000000000..24e49832f > --- /dev/null > +++ b/test/sql/prepared.test.lua > @@ -0,0 +1,196 @@ > +s = prepare("CREATE TRIGGER tr1 INSERT ON test1 FOR EACH ROW BEGIN DELETE FROM test1; END;") > +execute(s.sql_str) > +execute(s.sql_str) > + > +s = prepare("DROP TRIGGER tr1;") > +execute(s.sql_str) > +execute(s.sql_str) > + > +s = prepare("DROP TABLE test1;") > +execute(s.sql_str) > +execute(s.sql_str) 6. I saw Kostja asked about why do you increment schema version on some new system space updates, such as _trigger. There is why - we generate DROP TABLE as manual deletion of all space-related objects, and then of the space itself. Space related objects include triggers, ck, fk. And they are taken from struct space during compilation. That is, if you won't increment schema version on trigger/ck/fk update, a prepared DROP TABLE won't take the changes into account and won't be regenerated. This test fails in case you don't update schema version on _trigger update, right? Talking of all the 'generate DROP TABLE by struct space content' idea - I don't like it. But this is how it works now. I future we will need to open an iterator on the needed spaces at runtime. Because lookup in struct space won't be available on client side anyway. > +f1 = fiber.new(implicit_yield) > +f2 = fiber.new(implicit_yield) > +f1:set_joinable(true) > +f2:set_joinable(true) > + > +f1:join(); > +f2:join(); > + > +test_run:cmd("setopt delimiter ''"); > + > +box.space.TEST:drop() > +box.schema.func.drop('SLEEP') > 7. On the whole, that is probably the best thing happened to SQL in Tarantool for a very long time.
next prev parent reply other threads:[~2019-12-06 23:18 UTC|newest] Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-11-20 21:27 [Tarantool-patches] [PATCH v2 00/16] sql: " Nikita Pettik 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 01/16] sql: remove sql_prepare_v2() Nikita Pettik 2019-12-04 11:36 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 02/16] sql: refactor sql_prepare() and sqlPrepare() Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-04 11:36 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 03/16] sql: move sql_prepare() declaration to box/execute.h Nikita Pettik 2019-12-04 11:37 ` Konstantin Osipov 2019-12-05 13:32 ` Nikita Pettik 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 04/16] sql: rename sqlPrepare() to sql_compile() Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-13 13:49 ` Nikita Pettik 2019-12-04 11:39 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 05/16] sql: move sql_finalize() to execute.h Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-04 11:39 ` Konstantin Osipov 2019-12-13 13:49 ` Nikita Pettik 2019-12-04 11:40 ` Konstantin Osipov 2019-12-05 13:37 ` Nikita Pettik 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 06/16] port: increase padding of struct port Nikita Pettik 2019-12-04 11:42 ` Konstantin Osipov 2019-12-13 13:54 ` Nikita Pettik 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 07/16] port: add dump format and request type to port_sql Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-13 13:55 ` Nikita Pettik 2019-12-04 11:52 ` Konstantin Osipov 2019-12-13 13:53 ` Nikita Pettik 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 08/16] sql: resurrect sql_bind_parameter_count() function Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-04 11:54 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 09/16] sql: resurrect sql_bind_parameter_name() Nikita Pettik 2019-12-04 11:55 ` Konstantin Osipov 2019-12-04 11:55 ` Konstantin Osipov 2019-12-13 13:55 ` Nikita Pettik 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 10/16] sql: add sql_stmt_schema_version() Nikita Pettik 2019-12-04 11:57 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 11/16] sql: introduce sql_stmt_sizeof() function Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-13 13:56 ` Nikita Pettik 2019-12-04 11:59 ` Konstantin Osipov 2019-12-13 13:56 ` Nikita Pettik 2019-12-13 14:15 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 12/16] box: increment schema_version on ddl operations Nikita Pettik 2019-12-04 12:03 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 13/16] sql: introduce sql_stmt_query_str() method Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-04 12:04 ` Konstantin Osipov 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 14/16] sql: introduce cache for prepared statemets Nikita Pettik 2019-12-03 22:51 ` Vladislav Shpilevoy 2019-12-04 12:11 ` Konstantin Osipov 2019-12-17 14:43 ` Kirill Yukhin 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 15/16] box: introduce prepared statements Nikita Pettik 2019-12-04 12:13 ` Konstantin Osipov 2019-12-06 23:18 ` Vladislav Shpilevoy [this message] 2019-11-20 21:28 ` [Tarantool-patches] [PATCH v2 16/16] netbox: " Nikita Pettik 2019-12-06 23:18 ` Vladislav Shpilevoy 2019-12-03 22:51 ` [Tarantool-patches] [PATCH v2 00/16] sql: " Vladislav Shpilevoy 2019-12-17 15:58 ` Georgy Kirichenko
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=dd020d5c-0934-9ba3-d7b5-4efc02d35c3f@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v2 15/16] box: introduce prepared statements' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox