[Tarantool-patches] [PATCH v2 14/16] sql: introduce cache for prepared statemets

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Dec 4 01:51:09 MSK 2019


Thanks for the patch!

See 13 comments below.

On 20/11/2019 22:28, Nikita Pettik wrote:
> This patch introduces cache (as data structure) to handle prepared
> statements and a set of interface functions (insert, delete, find) to
> operate on it. Cache under the hood uses LRU eviction policy. It is
> implemented as a hash table with string keys (which contain original SQL
> statements) and prepared statements (struct sql_stmt which is an alias
> for struct Vdbe) as values. To realise LRU strategy we maintain list of
> nodes: head is the newest prepared statement, tail a candidate to be
> evicted. Size of cache is regulated via box.cfg{sql_cache_size}
> parameter. Cache is global to all sessions. To erase session manually,

1. Maybe 'erase cache', not session?

> one can set its size to 0. Default cache size is assumed to be 5 Mb.

2. In the cover letter you said, that it is not LRU. So what is it?
>From the code looks like a normal LRU.

> 
> Part of #2592
> ---
>  src/box/CMakeLists.txt          |   1 +
>  src/box/box.cc                  |  20 +++++
>  src/box/box.h                   |   1 +
>  src/box/errcode.h               |   1 +
>  src/box/lua/cfg.cc              |  12 +++
>  src/box/lua/load_cfg.lua        |   3 +
>  src/box/prep_stmt.c             | 186 ++++++++++++++++++++++++++++++++++++++++
>  src/box/prep_stmt.h             | 112 ++++++++++++++++++++++++
>  src/box/session.cc              |   1 +
>  src/box/sql.c                   |   3 +
>  test/app-tap/init_script.result |  37 ++++----
>  test/box/admin.result           |   2 +
>  test/box/cfg.result             |   7 ++
>  test/box/cfg.test.lua           |   1 +
>  test/box/misc.result            |   1 +
>  15 files changed, 370 insertions(+), 18 deletions(-)
>  create mode 100644 src/box/prep_stmt.c
>  create mode 100644 src/box/prep_stmt.h
> 
> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
> index 5cd5cba81..8be3d982d 100644
> --- a/src/box/CMakeLists.txt
> +++ b/src/box/CMakeLists.txt
> @@ -126,6 +126,7 @@ add_library(box STATIC
>      sql.c
>      bind.c
>      execute.c
> +    prep_stmt.c

3. Could you please rename it to sql_stmt_cache.c/.h?

>      wal.c
>      call.c
>      merger.c
> diff --git a/src/box/lua/cfg.cc b/src/box/lua/cfg.cc
> index 4884ce013..42070fe49 100644
> --- a/src/box/lua/cfg.cc
> +++ b/src/box/lua/cfg.cc
> @@ -274,6 +274,17 @@ lbox_cfg_set_net_msg_max(struct lua_State *L)
>  	return 0;
>  }
>  
> +static int
> +lbox_set_prepared_stmt_cache_size(struct lua_State *L)
> +{
> +	try {
> +		box_set_prepared_stmt_cache_size();
> +	} catch (Exception *) {

4. Lets avoid exceptions in new code to reduce size of
future patches converting everything to C. Just make diag_set +
int return.

> +		luaT_error(L);
> +	}
> +	return 0;
> +}
> +
>  static int
>  lbox_cfg_set_worker_pool_threads(struct lua_State *L)
>  {
> diff --git a/src/box/prep_stmt.c b/src/box/prep_stmt.c
> new file mode 100644
> index 000000000..fe3b8244e
> --- /dev/null
> +++ b/src/box/prep_stmt.c
> @@ -0,0 +1,186 @@
> +#include "prep_stmt.h"
> +
> +#include "assoc.h"
> +#include "error.h"
> +#include "execute.h"
> +#include "session.h"
> +
> +/** Default cache size is 5 Mb. */
> +size_t prep_stmt_cache_size = 5 * 1024 * 1024;

5. This value is 'extern' in the header, but is never
used out of prep_stmt.c. And is not changed even here.

I think you can drop it. And make the initial size 0,
because box.cfg will call lbox_set_prepared_stmt_cache_size().
Even if a user didn't specify that option, it is called
anyway with the default value from load_cfg.lua.

> +
> +static struct prep_stmt_cache prep_stmt_cache;
> +
> +void
> +sql_prepared_stmt_cache_init()

6. Consider name 'sql_stmt_cache'. 'Prepared' is assumed.
Because sql_stmt by definition is a prepared statement.

> +{
> +	prep_stmt_cache.hash = mh_strnptr_new();
> +	if (prep_stmt_cache.hash == NULL)
> +		panic("out of memory");
> +	prep_stmt_cache.mem_quota = prep_stmt_cache_size;
> +	prep_stmt_cache.mem_used = 0;
> +	rlist_create(&prep_stmt_cache.cache_lru);
> +}
> +
> +static size_t
> +sql_cache_node_sizeof(struct sql_stmt *stmt)
> +{
> +	return sql_stmt_sizeof(stmt) + sizeof(struct stmt_cache_node *);

7. Maybe sizeof(struct stmt_cache_node)? Without '*'.

> +}
> +
> +/**
> + * Allocate new cache node containing given prepared statement.
> + * Add it to the LRU cache list. Account cache size enlargement.
> + */
> +static struct stmt_cache_node *
> +sql_cache_node_new(struct sql_stmt *stmt)

8. This is strange. You not just created the node,
but also added to the cache. But not in the hash table,
only in the lru list. How about to make this function
do only creation of the node, and add to lru + update
the size in sql_prepared_stmt_cache_insert?

The same about node_delete. Or make new/delete() add
to the hash table/remove from it too?

> +{
> +	struct stmt_cache_node *node = malloc(sizeof(*node));
> +	if (node == NULL) {
> +		diag_set(OutOfMemory, sizeof(*node), "malloc",
> +			 "struct stmt_cache_node");
> +		return NULL;
> +	}
> +	node->stmt = stmt;
> +	rlist_add(&prep_stmt_cache.cache_lru, &node->in_lru);
> +	prep_stmt_cache.mem_used += sql_cache_node_sizeof(stmt);
> +	return node;
> +}
> diff --git a/src/box/prep_stmt.h b/src/box/prep_stmt.h
> new file mode 100644
> index 000000000..31f3ff52b
> --- /dev/null
> +++ b/src/box/prep_stmt.h
> @@ -0,0 +1,112 @@
> +#include <stdint.h>
> +#include <stdio.h>
> +
> +#include "small/rlist.h"
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif
> +
> +extern size_t prep_stmt_cache_size;
> +
> +/**
> + * Global prepared statement cache which follows LRU
> + * eviction policy. Implemented as hash <str : stmt>
> + * and double linked list.
> + */
> +struct prep_stmt_cache {

9. Lets call it 'sql_stmt_cache'. To be consistent in having
this widespread alias of struct Vdbe.

> +	/** Size of memory currently occupied by prepared statements. */
> +	size_t mem_used;
> +	/**
> +	 * Max memory size that can be used for cache.
> +	 */
> +	size_t mem_quota;
> +	/** Query hash -> struct prepared_stmt hash.*/

10. A typo? Struct prepared_stmt does not exist, and
you store a statement as a value, not its hash.

> +	struct mh_strnptr_t *hash;

11. Please, rename 'hash' to 'cache', and add a comment,
that cache_lru and cache contain the same records. The
lru list is maintained only for the eviction policy.

> +	struct rlist cache_lru;
> +};
> +
> +struct stmt_cache_node {

12. Please, try to encapsulate this inside prep_stmt.c.
The API should become simpler, when you operate by
struct sql_stmt only, and don't need to worry about
cache nodes.

Maybe that is not possible, I don't know. I didn't
looked at the last two patches yet.

> +	/** Prepared statement itself. */
> +	struct sql_stmt *stmt;
> +	/**
> +	 * Link to the next node. Head is the newest, tail is
> +	 * a candidate to be evicted.
> +	 */
> +	struct rlist in_lru;
> +};
> +
> +struct sql_stmt;

13. This probably should go before first usage of struct sql_stmt,
which is above. As well as declaration of struct mh_strnptr_t.


More information about the Tarantool-patches mailing list