[tarantool-patches] Re: [PATCH v3 1/3] schema: add new system space for CHECK constraints

n.pettik korablev at tarantool.org
Thu Apr 25 23:38:42 MSK 2019



> On 16 Apr 2019, at 16:51, Kirill Shcherbatov <kshcherbatov at tarantool.org> wrote:
> 
> This patch introduces a new system space to persist check
> constraints. Format of the space:
> 
> _ck_constraint (space id = 357)
> [<constraint name> STR, <space id> UINT, <expression string>STR]
> 
> CK constraint is local to space, so every pair <CK name, space id>
> is unique (and it is PK in _ck_constraint space).
> 
> After insertion into this space, a new instance describing check
> constraint is created. Check constraint holds Expr AST tree.

Nit: AST includes ’tree’ word :) -> ‘holds AST of expression representing it’.

> While space features check constraints, it isn't allowed to
> be dropped. The :drop() space method firstly deletes all check
> constraints and then removes entry from _space.
> 
> Because space alter, index alter and space truncate operations
> case

-> cause, I guess.

> space recreation, introduced RebuildCkConstrains object
> that compile new ck constraint objects, replace and remove
> existent instances atomically(when some compilation fails,
> nothing changed).

Still don’t understand necessity of re-creating check constraints
in case of truncation or index alteration. Truncate can turn out to
be quite expensive in terms of performance since we will have
to rebuild and recompile all check, foreign key constraints and
SQL triggers.

> In fact, in scope of this patch we don't
> really need to recreate ck_constraint object in such situations
> (patch space_def pointer in AST like we did it before is enough
> now, but we would recompile VDBE that represents ck constraint
> in future that can fail).
> The main motivation for these changes is the ability to support
> ADD CHECK CONSTRAINT operation in the future.
> 

Why can’t we implement this statement without separate space?

> ---
> 
> /* }}} */
> @@ -4035,6 +4098,161 @@ on_replace_dd_fk_constraint(struct trigger * /* trigger*/, void *event)
> 	}
> }
> 
> +/** Create an instance of check constraint definition by tuple. */
> +static struct ck_constraint_def *
> +ck_constraint_def_new_from_tuple(struct tuple *tuple)
> +{
> +	uint32_t name_len;
> +	const char *name =
> +		tuple_field_str_xc(tuple, BOX_CK_CONSTRAINT_FIELD_NAME,
> +				   &name_len);
> +	if (name_len > BOX_NAME_MAX) {
> +		tnt_raise(ClientError, ER_CREATE_CK_CONSTRAINT,
> +			  tt_cstr(name, BOX_INVALID_NAME_MAX),
> +				  "check constraint name is too long");
> +	}
> +	identifier_check_xc(name, name_len);
> +	uint32_t expr_str_len;
> +	const char *expr_str =
> +		tuple_field_str_xc(tuple, BOX_CK_CONSTRAINT_FIELD_EXPR_STR,
> +				   &expr_str_len);
> +
> +	uint32_t name_offset, expr_str_offset;
> +	size_t ck_def_sz =
> +		ck_constraint_def_sizeof(name_len, expr_str_len, &name_offset,
> +					 &expr_str_offset);

Nit: function returns uint_32. Please, make types be consistent.

> diff --git a/src/box/ck_constraint.c b/src/box/ck_constraint.c
> new file mode 100644
> index 000000000..daeebb78b
> --- /dev/null
> +++ b/src/box/ck_constraint.c
> @@ -0,0 +1,115 @@
> 
> +#include "ck_constraint.h"
> +#include "errcode.h"
> +#include "small/rlist.h”

This header is already included in ck_constraint.h

> +#include "sql.h"
> +#include "sql/sqlInt.h”

It’s awful that we keep on including this header...

> +
> +/**
> + * Resolve space_def references for check constraint via AST
> + * tree traversal.
> + * @param ck_constraint Check constraint object to update.
> + * @param space_def Space definition to use.
> + * @retval 0 on success.
> + * @retval -1 on error.
> + */
> +static int
> +ck_constraint_resolve_space_def(struct Expr *expr,
> +				struct space_def *space_def)
> +{

I would call this method like “ck_constraint_resolve_names” or
“ck_constraint_resolve_field_names”. IMHO we can’t resolve
space definition and it may sound confusing...

> diff --git a/src/box/ck_constraint.h b/src/box/ck_constraint.h
> new file mode 100644
> index 000000000..615612605
> --- /dev/null
> +++ b/src/box/ck_constraint.h
> @@ -0,0 +1,163 @@
> 
> +/**
> + * Find check constraint object in space by given name and
> + * name_len.
> + * @param space The space to lookup check constraint.
> + * @param name The check constraint name.
> + * @param name_len The length of the name.
> + * @retval not NULL Check constrain if exists, NULL otherwise.
> + */
> +struct ck_constraint *

This function is used only in alter.cc, mb it is worth moving it to alter.cc
and make it static?

> +space_ck_constraint_by_name(struct space *space, const char *name,
> +			    uint32_t name_len);
> +
> 
> diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
> index f31cf7f2c..e01f500e6 100644
> --- a/src/box/lua/schema.lua
> +++ b/src/box/lua/schema.lua
> 
> local function create_sysview(source_id, target_id)
> @@ -735,6 +737,43 @@ local function upgrade_to_2_1_3()
>             id = id + 1
>         end
>     end
> +

This upgrade should be part of 2.2.0, I guess.

> +    -- In previous Tarantool releases check constraints were
> +    -- stored in space opts. Now we use separate space
> +    -- _ck_constraint for this purpose. Perform legacy data
> +    -- migration.
> diff --git a/src/box/schema_def.h b/src/box/schema_def.h
> +/** _ck_constraint fields. */
> +enum {
> +	BOX_CK_CONSTRAINT_FIELD_NAME = 0,
> +	BOX_CK_CONSTRAINT_FIELD_SPACE_ID = 1,
> +	BOX_CK_CONSTRAINT_FIELD_EXPR_STR = 2,

In addition to these properties mb it is worth adding
<constraint check time> field: ANSI says that every
constraint can be created with this option. I suggest to do
it now to avoid changing system spaces in future. Note that
_fk_constraints already has this attribute.

> diff --git a/src/box/sql/build.c b/src/box/sql/build.c
> index b1ed64a01..e55bbe43b 100644
> --- a/src/box/sql/build.c
> +++ b/src/box/sql/build.c
> @@ -47,6 +47,7 @@
> #include "vdbeInt.h"
> #include "tarantoolInt.h"
> #include "box/box.h"
> +#include "box/ck_constraint.h"
> #include "box/fk_constraint.h"
> #include "box/sequence.h"
> #include "box/session.h"
> @@ -643,37 +644,74 @@ primary_key_exit:
> void
> sql_add_check_constraint(struct Parse *parser)

Please, make name of this function consistent with
sql_create_foreign_key(). i.e. rename _add_check_...()
to sql_create_check_contraint() or vice versa.

> {
> -	struct create_ck_def *ck_def = &parser->create_ck_def;
> +	struct create_ck_def *create_ck_def = &parser->create_ck_def;
> +	struct ExprSpan *expr_span = create_ck_def->expr;
> +	sql_expr_delete(parser->db, expr_span->pExpr, false);

Looks inefficient: we delete expr right after its creation..
Could we come up with workaround to avoid this? I mean
cut string containing expression without allocating/releasing
memory for it and without filling in expr tree?

> +	uint32_t name_len = strlen(name);

Nit: strlen() return size_t type.

> +
> +	uint32_t expr_str_len = (uint32_t)(create_ck_def->expr->zEnd -
> +					   create_ck_def->expr->zStart);
> +	const char *expr_str = create_ck_def->expr->zStart;
> +
> +	/*
> +	 * Allocate memory for ck constraint parse structure and
> +	 * ck constraint definition as a single memory chunk on
> +	 * region:
> +	 *
> +	 *    [ck_parse][ck_def[name][expr_str]]
> +	 *         |_____^  |___^     ^
> +	 *                  |_________|
> +	 */
> +	uint32_t name_offset, expr_str_offset;
> +	size_t ck_def_sz =
> +		ck_constraint_def_sizeof(name_len, expr_str_len, &name_offset,
> +					 &expr_str_offset);

Nit: ck_constraint_def_sizeof() return uint32_t

Also why do we not use vdbe_emit_halt_with_presence_test()
during creation of check constraints?

tarantool> create table t2(id int primary key, constraint ck1 check(id > 0), constraint ck1 check(id < 0))
---
- error: Duplicate key exists in unique index 'primary' in space '_ck_constraint'
...
tarantool> create table t2(id int primary key, constraint fk1 foreign key(id) references t2, constraint fk1 foreign key(id) references t2)
---
- error: Constraint FK1 already exists
...

> diff --git a/src/box/sql/insert.c b/src/box/sql/insert.c
> index c2aac553f..a075d4dc3 100644
> --- a/src/box/sql/insert.c
> +++ b/src/box/sql/insert.c
> 
> @@ -1245,14 +1239,26 @@ xferOptimization(Parse * pParse,	/* Parser context */
> 		if (pSrcIdx == NULL)
> 			return 0;
> 	}
> -	/* Get server checks. */
> -	ExprList *pCheck_src = src->def->opts.checks;
> -	ExprList *pCheck_dest = dest->def->opts.checks;
> -	if (pCheck_dest != NULL &&
> -	    sqlExprListCompare(pCheck_src, pCheck_dest, -1) != 0) {
> -		/* Tables have different CHECK constraints.  Ticket #2252 */
> +	/*
> +	 * Dissallow the transfer optimization if the check
> +	 * constraints are differ.
> +	 */
> +	if (rlist_empty(&dest->ck_constraint) !=
> +	    rlist_empty(&src->ck_constraint))
> 		return 0;
> +	struct rlist *dst_ck_link = &dest->ck_constraint;
> +	struct ck_constraint *src_ck;
> +	rlist_foreach_entry(src_ck, &src->ck_constraint, link) {
> +		dst_ck_link = rlist_next(dst_ck_link);
> +		if (dst_ck_link == &dest->ck_constraint)
> +			return 0;
> +		struct ck_constraint *dest_ck =
> +			rlist_entry(dst_ck_link, struct ck_constraint, link);
> +		if (sqlExprCompare(src_ck->expr, dest_ck->expr, -1) != 0)
> +			return 0;
> 	}
> +	if (rlist_next(dst_ck_link) != &dest->ck_constraint)
> +		return 0;

After next patch this check could be removed: VDBE program for checking
consistency of check constraints would be generated automatically.
In SQLite xFer allows to avoid generation of this program. On the other hand,
if you added way to temporarily disable check constraints, we would able
to leave this check and re-enable them after query is executed. So, consider
way of disabling/enabling check constraints - it might be useful for other
cases as well.

> diff --git a/src/box/sql/parse_def.h b/src/box/sql/parse_def.h
> index a1af2bacd..ec434420d 100644
> --- a/src/box/sql/parse_def.h
> +++ b/src/box/sql/parse_def.h
> @@ -123,6 +123,20 @@ struct fk_constraint_parse {
> 	struct rlist link;
> };
> 
> +/**
> + * Structure representing check constraint appeared within
> + * CREATE TABLE statement. Used only during parsing.
> + */
> +struct ck_constraint_parse {
> +	/**
> +	 * Check constraint declared in <CREATE TABLE ...>
> +	 * statement. Must be coded after space creation.

Nit: Definition of check constraints…Mention that they
don’t need any cleanups, since they are allocated using region.

> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index b1ec8c758..531e29ed5 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -6445,7 +6445,12 @@ sql_expr_extract_select(struct Parse *parser, struct Select *select)
> 	struct ExprList *expr_list = select->pEList;
> 	assert(expr_list->nExpr == 1);
> 	parser->parsed_ast_type = AST_TYPE_EXPR;
> -	parser->parsed_ast.expr = sqlExprDup(parser->db,
> -						 expr_list->a->pExpr,
> -						 EXPRDUP_REDUCE);
> +	/*
> +	 * Extract a copy of parsed expression.
> +	 * We cannot use EXPRDUP_REDUCE flag in sqlExprDup call
> +	 * because some compiled Expr (like Checks expressions)
> +	 * may require further resolve with sqlResolveExprNames.

Agh, still can’t point out what does this flag mean. Could you please briefly explain?

> +	 */
> +	parser->parsed_ast.expr =
> +		sqlExprDup(parser->db, expr_list->a->pExpr, 0);
> }
> 
> /**
> diff --git a/test/app-tap/tarantoolctl.test.lua b/test/app-tap/tarantoolctl.test.lua
> index 62a78d6bf..9ac0b5c69 100755
> --- a/test/app-tap/tarantoolctl.test.lua
> +++ b/test/app-tap/tarantoolctl.test.lua
> @@ -352,7 +352,7 @@ do
> 
>     local filler_code = [[
>         box.cfg{memtx_memory = 104857600, background=false}
> -        local space = box.schema.create_space("test")
> +        space = box.schema.create_space("test”)

Why this diff is needed?

> diff --git a/test/sql-tap/check.test.lua b/test/sql-tap/check.test.lua
> index b01afca7c..0dda3fac8 100755
> --- a/test/sql-tap/check.test.lua
> +++ b/test/sql-tap/check.test.lua
> @@ -319,7 +319,7 @@ test:do_catchsql_test(
>         );
>     ]], {
>         -- <check-3.1>
> -        1, "Failed to create space 'T3': Subqueries are prohibited in a CHECK constraint definition"
> +        1, "Failed to create check constraint 'CK_CONSTRAINT_1_T3': Subqueries are prohibited in a check constraint definition”

In some error messages "check” word is uppercased, in the rest - isn’t.
Please, make this rule be consistent. 

> diff --git a/test/sql/checks.test.lua b/test/sql/checks.test.lua
> index 5bfcf12f8..0cdfde2e9 100644
> --- a/test/sql/checks.test.lua
> +++ b/test/sql/checks.test.lua
> @@ -8,41 +8,90 @@ box.execute('pragma sql_default_engine=\''..engine..'\'')
> -- gh-3272: Move SQL CHECK into server
> --
> -opts = {checks = {{expr = 'X>5', name = 'ONE'}}}
> -format = {{name = 'X', type = 'unsigned'}}
> -t = {513, 1, 'test', 'memtx', 0, opts, format}
> -s = box.space._space:insert(t)
> -box.space._space:delete(513)
> +-- Invalid expression test.
> +box.space._ck_constraint:insert({'CK_CONSTRAINT_01', 513, 'X><5'})
> +-- Unexistent space test.
> +box.space._ck_constraint:insert({'CK_CONSTRAINT_01', 550, 'X<5'})
> +-- Pass integer instead of expression.
> +box.space._ck_constraint:insert({'CK_CONSTRAINT_01', 550, 666})

Please, pass valid space id in this particular test (to void confusion).

> +-- Test xferOptimization for space that have CK constraints.
> +box.execute("CREATE TABLE first (id FLOAT PRIMARY KEY CHECK(id < 5), a INT CONSTRAINT ONE CHECK(a < 5));")
> +box.execute("CREATE TABLE second (id FLOAT PRIMARY KEY CHECK(id < 5), a INT CONSTRAINT ONE CHECK(a < 5));")
> +box.execute("CREATE TABLE third (id FLOAT PRIMARY KEY, a INT CONSTRAINT ONE CHECK(a < 5));")
> +box.execute("INSERT INTO first VALUES(1, 1);")
> +box.execute("INSERT INTO first VALUES(2, 2);")
> +box.execute("INSERT INTO first VALUES(3, 3);")
> +box.execute("INSERT OR IGNORE INTO second SELECT * FROM first;")
> +box.execute("INSERT OR IGNORE INTO third SELECT * FROM first;")
> +box.execute("DELETE FROM second;")
> +box.execute("INSERT OR IGNORE INTO second SELECT * FROM third;”)

How does this check that xFer was really fired?

> +-- Test space creation rollback on spell error in ck constraint.
> +box.execute("CREATE TABLE first (id FLOAT PRIMARY KEY CHECK(id < 5), a INT CONSTRAINT ONE CHECK(a >< 5));")
> +box.space.FIRST == nil
> +box.space._ck_constraint:count() == 0
> +box.space.FIRST:drop()

Isn’t this redundant check? I mean box.space.FIRST == nil already
checks that space hasn’t been created.

> +-- Ck constraints are disallowed for spaces having no format.
> +s = box.schema.create_space('test')
> +_ = s:create_index('pk')
> +_ = box.space._ck_constraint:insert({'physics', s.id, 'X<Y'})
> +s:format({{name='X', type='integer'}, {name='Y', type='integer'}})
> +_ = box.space._ck_constraint:insert({'physics', s.id, 'X<Y'})
> +box.execute("INSERT INTO \"test\" VALUES(2, 1);")
> +s:format({{name='Y', type='integer'}, {name='X', type='integer'}})
> +box.execute("INSERT INTO \"test\" VALUES(1, 2);")
> +box.execute("INSERT INTO \"test\" VALUES(2, 1);")
> +s:truncate()
> +box.execute("INSERT INTO \"test\" VALUES(1, 2);")
> +s:format({})

Good test scenario.

Also, it would be nice to see a few complex expression. For instance,
involving CAST, built-in functions and other allowed in check exprs
constructions.

> diff --git a/test/sql/upgrade.test.lua b/test/sql/upgrade.test.lua
> index b76a8f373..cfda74a08 100644
> --- a/test/sql/upgrade.test.lua
> +++ b/test/sql/upgrade.test.lua
> @@ -62,6 +62,11 @@ s ~= nil
> i = box.space._index:select(s.id)
> i ~= nil
> i[1].opts.sql == nil
> +box.space._space:get(s.id).flags.checks == nil
> +check = box.space._ck_constraint:select()[1]
> +check ~= nil
> +check.name
> +check.expr_str

Am I right that last time snapshot to be upgraded was changed, it
was filled with space featuring check constraint? I see that you didn’t
change snapshot, but space has check constraint after upgrade.






More information about the Tarantool-patches mailing list